CORRELATION ANALYSIS AS A MEANS OF STUDYING
CONTRIBUTIONS OF CAUSES
by
Walter S. Monroe
University of Illinois
and
D. B. Stuit
Carleton College


INTRODUCTION


When a trait or condition has been demonstrated to be a cause of a second trait or condition, a measure of the magnitude of the contribution is frequently desired. When paired measures of the two traits or conditions are available for a representative population, correlation analysis is commonly employed for this purpose. The measures of the effect are designated as the dependent variable and those of the cause, the independent variable. Techniques of correlation analysis have been developed also for situations in which there are two or more independent variables.

If $x_0$ and $x_1$ represent correlated variables, expressed in terms of deviations from their respective means, it is possible to explain the existence of this relationship in terms of a common factor. This theorem¹ means that the two variables may be analyzed as follows:

$$x_0 = c_0 a_{01} + e_0$$
$$x_1 = c_1 a_{01} + e_1$$

In these equations $c_0$ and $c_1$ are constants; $e_0$ and $e_1$ are factors which are uncorrelated with each other and with $a_{01}$; $a_{01}$ is a factor whose magnitude for any pair of values of $x_0$ and $x_1$ is the same, or if the nature of $x_0$ and $x_1$ is such that this condition is not reasonable, $a_{01}$ designates two variables that are perfectly correlated. By a proper choice of units, $c_0$ can be reduced to unity. Hence, we may deal with the following analysis.

$$x_0 = a_{01} + e_0$$
$$x_1 = c_1 a_{01} + e_1$$


As implied in the preceding paragraph, a causal variable is typically complex but it may be thought of as analyzable into two uncorrelated sub-variables, one ($a_{01}$) representing the contribution and the other being uncorrelated with the effect or dependent variable. If the latter, represented by $e_1$, is zero, the independent variable $x_1$ is described as contributing itself completely to $x_0$ and may be designated as a component variable. The uncorrelated sub-variables are spoken of as factors and $a_{01}$ is called the common factor.

1. For proof of this theorem, see Kelley, T. L. Crossroads in the Mind of Man. Stanford, California: Stanford University Press, 1928, p. 38.

It is not possible to determine the values of the factors of two given variables as indicated, but the meaning of the analysis may be illustrated by taking sums of the corresponding values of uncorrelated variables¹ as in Table I. If $X_1$ is thought of as a cause of $X_0$, its contribution² is obviously represented by $A_1$. Since the ratio of $A_1$ to $X_0$ varies and is dependent upon the location of the zero points, it has no meaning as a measure of the contribution of $X_1$ to $X_0$. If the variables are expressed in deviation form, the situation is not improved. An approach to a meaningful measure can be made by thinking of the standard deviation of the distribution of the values of a variable as a measure of its "magnitude." From this point of view, the contribution of an independent variable that contributes itself completely to the dependent variable would be measured by the ratio of its standard deviation to the standard deviation of the dependent variable. However, to facilitate algebraic treatment, the standard deviation squared, called the variance, is employed in preference to the first power. The contribution of $X_1$ to $X_0$ can then be described in terms of variance. Representing the variance of $A_1$ by $\sigma_{A_1}^2$, the variance ratio $\sigma_{A_1}^2 / \sigma_0^2$ is a measure of the contribution of $X_1$ to $X_0$. Hence, the problem of determining the contribution of a given independent variable to a given dependent variable may be interpreted as that of determining the ratio of the variance of the common factor to the variance of the dependent variable.

TABLE I
ILLUSTRATIONS OF THE VALUES OF TWO CORRELATED VARIABLES OBTAINED BY ADDING THE CORRESPONDING VALUES OF UNCORRELATED SUB-VARIABLES

X0 = A1 + E0        X1 = A1 + E1
25 = 16 +  9        28 = 16 + 12
21 = 12 +  9        20 = 12 +  8
22 = 16 +  6        26 = 16 + 10
24 = 15 +  9        20 = 15 +  5
23 = 16 +  7        23 = 16 +  7
23 = 16 +  7        23 = 16 +  7
25 = 15 + 10        21 = 15 +  6
27 = 19 +  8        29 = 19 + 10
20 = 12 +  8        18 = 12 +  6
22 = 13 +  9        22 = 13 +  9
17 =  9 +  8        17 =  9 +  8
23 = 14 +  9        18 = 14 +  4
23 = 17 +  6        25 = 17 +  8
15 = [illegible]    [illegible] +  9
22 = 16 +  6        24 = 16 +  8
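The variance ratio just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `variance_ratio` and the four-value data set are hypothetical and are not the coin-toss values of Table I. The sub-variable `e0` is chosen so that its covariance with `a1` is exactly zero, which makes the variances add without sampling error.

```python
from statistics import pvariance

def variance_ratio(common_factor, dependent):
    """Variance of the common factor as a fraction of the dependent variable's variance."""
    return pvariance(common_factor) / pvariance(dependent)

# Hypothetical miniature data, not the coin-toss values of Table I.
a1 = [1, 2, 3, 4]      # common factor A1
e0 = [1, -1, -1, 1]    # chosen so that its covariance with a1 is exactly zero
x0 = [a + e for a, e in zip(a1, e0)]   # dependent variable X0 = A1 + E0

ratio = variance_ratio(a1, x0)
print(round(ratio, 4))   # 0.5556, i.e. 1.25 / 2.25
```

Because the ratio is formed from variances, it does not change when a constant is added to every value, which is the property the raw ratio of $A_1$ to $X_0$ lacks.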


RELATION OF COEFFICIENT OF CORRELATION 
TO VARIANCE RATIO 
If $x_0 = a_{01} + e_0$ and $x_1 = c_1 a_{01} + e_1$ are substituted in the formula $r_{01} = \dfrac{\Sigma x_0 x_1}{N \sigma_0 \sigma_1}$, the expression thus obtained may be simplified to

$$r_{01}^2 = \frac{\sigma_{a_{01}}^2}{\sigma_{a_{01}}^2 + \sigma_{e_0}^2} \cdot \frac{c_1^2 \sigma_{a_{01}}^2}{c_1^2 \sigma_{a_{01}}^2 + \sigma_{e_1}^2}$$

The first of the two fractions in the right-hand member is the variance ratio. For a given value of $r_{01}$, the value of this ratio is a minimum when the second is unity, i.e., when $\sigma_{e_1} = 0$ and $x_1$ is a component of $x_0$. When $\sigma_{e_1} \neq 0$, i.e., when $x_1$ is not a component of $x_0$, the value of the variance ratio is larger. It is equal to $r_{01}$ when $\sigma_{e_1} = c_1 \sigma_{e_0}$. For other values of $\sigma_{e_1}$, the value of the variance ratio can only be estimated unless certain supplementary data are available.³ For a given value of the variance ratio, the value of $r_{01}$ decreases as $\sigma_{e_1}$ increases.
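The three statements about the variance ratio can be checked numerically. The sketch below assumes the product form of $r_{01}^2$ given above; the function names and the variances (9, 3) and constant (2) are hypothetical values chosen only to keep the arithmetic simple.

```python
from math import sqrt

def variance_ratio(var_a, var_e0):
    """First fraction: share of the variance of x0 due to the common factor."""
    return var_a / (var_a + var_e0)

def r_squared(var_a, var_e0, var_e1, c1=1.0):
    """Squared correlation between x0 = a + e0 and x1 = c1*a + e1."""
    second = (c1 ** 2 * var_a) / (c1 ** 2 * var_a + var_e1)
    return variance_ratio(var_a, var_e0) * second

# Hypothetical variances and constant, chosen only for illustration.
VAR_A, VAR_E0, C1 = 9.0, 3.0, 2.0
ratio = variance_ratio(VAR_A, VAR_E0)

# x1 is a component of x0 (sigma_e1 = 0): the ratio equals r squared.
r2_component = r_squared(VAR_A, VAR_E0, 0.0, C1)
# sigma_e1 = c1 * sigma_e0: the ratio equals r itself.
r_equal = sqrt(r_squared(VAR_A, VAR_E0, C1 ** 2 * VAR_E0, C1))
# A larger sigma_e1 with the same ratio: r is smaller.
r_noisier = sqrt(r_squared(VAR_A, VAR_E0, 4 * C1 ** 2 * VAR_E0, C1))

print(ratio, r2_component, round(r_equal, 4), round(r_noisier, 4))
```

The same variance ratio of .75 is therefore compatible with quite different values of the correlation, which is why the ratio cannot be recovered from $r_{01}$ alone.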


1. The values of these uncorrelated variables were secured by counting the number of heads or tails resulting from tosses of collections of coins. For example, the values labeled $A_1$ are the counts of heads or tails resulting from tosses of thirty coins. As a means of eliminating the effect of any imperfections in the coins, heads were counted for the first fifteen tosses, tails for the next fifteen, and so on. For use in the calculations described later, counts of one hundred tosses of each collection were made. Table I gives only illustrative values.

2. The reader's attention is called to the fact that the arguments relating to correlation analysis are in terms of deviation measures. The data derived by counting tosses of coins were not reduced to deviation measures because they are in terms of the same units and expressed from absolute zero points. Hence, the writers are justified in using these raw measures as if they were deviation measures. Throughout the article, large letters will be employed to denote raw measures and small letters will be used for the deviation measures.

3. For a description of the technique, see Tryon, R. C. "The Interpretation of the Coefficient of Correlation", Psychological Review, 36:423-24, September, 1929.
PARTIAL CORRELATION AS A MEANS OF
ELIMINATING THE EFFECT OF FACTORS
OF HETEROGENEITY


The correlation between two variables is affected by the variability of the population with reference to other variables. For example, the correlation between test scores in two school subjects such as arithmetic and silent reading will vary with the variability of the population relative to intelligence test scores. Usually the correlation between two variables is calculated for a population that is heterogeneous with reference to one or more other traits, and frequently the correlation is desired for a population that is homogeneous in certain respects. The technique of partial correlation represented by the formula

$$r_{01.2} = \frac{r_{01} - r_{02}\, r_{12}}{\sqrt{1 - r_{02}^2}\,\sqrt{1 - r_{12}^2}}$$

has been proposed as a means of obtaining a coefficient of correlation for a homogeneous population from the data collected from a population heterogeneous with reference to a measured trait. It has been pointed out¹ that partial correlation may fail to yield this result. By employing variables whose values are sums of uncorrelated sub-variables as illustrated in Table I, the desired correlation may be calculated directly as well as by means of partial correlation. Hence, we have a means of testing the operation of this technique. The results obtained by means of the partial correlation formula and by direct calculation are given for several factor patterns.


1. Burks, B. S. "Statistical Hazards in Nature-Nurture Investigations", Twenty-Seventh Yearbook of the National Society for the Study of Education, Part I (Bloomington, Illinois: Public School Publishing Company, 1928, pp. 12-13). See also: Burks, B. S. "On the Inadequacy of the Partial and Multiple Correlation Technique", Journal of Educational Psychology, 17:532-40, 625-30, November, December, 1926.

Illustration 1.

$X_0 = A_1 + A_2 + A_3 + A_4 + E_0$
$X_1 = A_1 + A_2 + E_1$
$X_2 = A_1$                  $r_{01.2} = .50$
$X_3 = A_1 + A_3 + E_2$      $r_{01.3} = .55$

By direct calculation, the correlation between $A_2 + A_3 + A_4 + E_0$ and $A_2 + E_1$ is .50.

Illustration 2.

[The definitions of $X_0$, $X_1$, and $X_2$, each the sum of two A sub-variables and one E sub-variable, are illegible in the source.]

By partial correlation formula, $r_{01.2}$ = [illegible].
By direct calculation, the correlation between the two remainders, each consisting of one A sub-variable and one E sub-variable, is -.025.

Illustration 3.

[The definitions of $X_0$, $X_1$, $X_3$, and $X_4$ are illegible in the source.]
$X_2 = A_1$                  $r_{01.2} = .400$
                             $r_{01.3} = .666$
                             $r_{01.4} = .594$

Illustration 4.

$X_0 = A_1 + A_2 + E_0 + E_1$
$X_1 = A_3 + A_4 - E_0 - E_1$
$X_2 = E_0 + A_5$            $r_{01.2} = -.201$
$X_3 = E_0$ + [illegible]    $r_{01.3} = -.201$
$X_4 = E_0$ + [illegible]    $r_{01.4}$ = [illegible]

By direct calculation, the correlation between $A_1 + A_2 + E_1$ and $A_3 + A_4 - E_1$ is -.139.

The explanation of these discrepancies is apparent when the derivation of the formula for partial correlation is examined. The partial coefficient, $r_{01.2}$, represents the correlation between¹

$$\left(X_0 - r_{02}\frac{\sigma_0}{\sigma_2} X_2\right)$$

and

$$\left(X_1 - r_{12}\frac{\sigma_1}{\sigma_2} X_2\right)$$

which are errors of estimate.² If $X_2$ is a component of $X_0$, that is, contributes itself completely to $X_0$, and $X_0$ and $X_2$ are expressed in terms of equivalent units, $r_{02}\frac{\sigma_0}{\sigma_2}$ becomes unity and the first error of estimate can be expressed simply as $X_0 - X_2$. Similarly, if $X_2$ is a component of $X_1$, the second error of estimate can be written $X_1 - X_2$. In this case then, the partialling out of $X_2$ from $X_0$ and $X_1$ is a matter of simple subtraction, and the obtained coefficient is a measure of the correlation between the remainders or residuals. If the variable partialled out is not a component, the quantity subtracted is only a best estimate of what it is desired to remove and the remainder is only an estimate of what the investigator is attempting to obtain.

When the variables are defined as in Illustration 1, the value of $r_{01.3}$ obtained by applying the partial correlation formula is the correlation between

$$\left[(A_1 + A_2 + A_3 + A_4 + E_0) - r_{03}\frac{\sigma_0}{\sigma_3}(A_1 + A_3 + E_2)\right]$$

and

$$\left[(A_1 + A_2 + E_1) - r_{13}\frac{\sigma_1}{\sigma_3}(A_1 + A_3 + E_2)\right].$$

In other words, all of the $A_1$ is not removed since the terms $r_{03}\frac{\sigma_0}{\sigma_3}$ and $r_{13}\frac{\sigma_1}{\sigma_3}$ are equal to unity only when $X_3$ contributes itself completely to $X_0$ and $X_1$. The result is that the obtained variables are not perfectly homogeneous with respect to $A_1$ and in addition have been made heterogeneous with respect to $E_2$. Hence, in employing the partial correlation technique, if the desired results are to be obtained, one must be certain that the variable partialed out consists only of one or more components which are known to be present in both $X_0$ and $X_1$. If $X_2$ contains factors not included in $X_0$ and $X_1$, only an approximate removal of $X_2$ is accomplished.
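The interpretation of the partial coefficient as a correlation between errors of estimate can be tried on simulated coin tosses. The following is a minimal sketch, not the writers' own computation: thirty coins per A collection follows the footnote on the coin data, while sixteen coins per E collection, the seed, and all function names are assumptions made for this illustration. The factor pattern is that of Illustration 1.

```python
import random
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def corr(x, y):
    """Product-moment correlation of two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_r(r01, r02, r12):
    """Partial correlation of variables 0 and 1 with variable 2 partialed out."""
    return (r01 - r02 * r12) / (sqrt(1 - r02 ** 2) * sqrt(1 - r12 ** 2))

def errors_of_estimate(y, z):
    """Deviations of y less its regression estimate from z, i.e. y - r*(sd_y/sd_z)*z."""
    my, mz = mean(y), mean(z)
    slope = (sum((a - mz) * (b - my) for a, b in zip(z, y))
             / sum((a - mz) ** 2 for a in z))
    return [(b - my) - slope * (a - mz) for a, b in zip(z, y)]

rng = random.Random(0)  # fixed seed so that the run is repeatable

def heads(n_coins, n_tosses=100):
    """Number of heads in each of n_tosses tosses of a collection of n_coins."""
    return [sum(rng.random() < 0.5 for _ in range(n_coins)) for _ in range(n_tosses)]

A1, A2, A3, A4 = (heads(30) for _ in range(4))   # thirty coins, as in the footnote
E0, E1, E2 = (heads(16) for _ in range(3))       # sixteen coins is an assumption

X0 = [a1 + a2 + a3 + a4 + e0 for a1, a2, a3, a4, e0 in zip(A1, A2, A3, A4, E0)]
X1 = [a1 + a2 + e1 for a1, a2, e1 in zip(A1, A2, E1)]
X2 = A1                                                   # a component of X0 and X1
X3 = [a1 + a3 + e2 for a1, a3, e2 in zip(A1, A3, E2)]     # not a component

results = {}
for name, z in (("component", X2), ("non-component", X3)):
    by_formula = partial_r(corr(X0, X1), corr(X0, z), corr(X1, z))
    by_residuals = corr(errors_of_estimate(X0, z), errors_of_estimate(X1, z))
    results[name] = (by_formula, by_residuals)
    print(name, round(by_formula, 3), round(by_residuals, 3))

# Correlation wanted: X0 and X1 with A1 actually removed.
direct = corr([a2 + a3 + a4 + e0 for a2, a3, a4, e0 in zip(A2, A3, A4, E0)],
              [a2 + e1 for a2, e1 in zip(A2, E1)])
print("direct", round(direct, 3))
```

In each case the formula and the correlation between errors of estimate agree, since they are algebraically the same quantity. Whether that quantity is close to the directly calculated correlation depends on whether the variable partialed out is a component, which is the point of the illustrations.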

The usual interpretation of a coefficient of partial correlation assumes a factor pattern of the type

$$x_0 = a_1 + a_2 + a_3$$
$$x_1 = a_1 + a_2 + a_4$$
$$x_2 = a_1$$


1. This concept of partial correlation is found in the work of Yule who developed the technique. See: Yule, G. Udny. An Introduction to the Theory of Statistics. London: Charles Griffin and Company Ltd., 1917, p. 256. For a more recent expression of the idea, see: Dunlap, J. W. and Cureton, E. E. "On the Analysis of Causation", Journal of Educational Psychology, 21:664-65, December, 1930.

2. Errors of estimate are those involved in estimating the values of one variable from those of another by the use of the regression equation. The word "residuals", employed by Yule, appears to be a more appropriate term for the above expressions, but "errors of estimate" is more generally employed at the present time.

Usually $x_2$ is not a component of the other two variables and the calculated values of $r_{02}$ and $r_{12}$ are less than they would be if $x_2$ were a component. Suppose $r_{01} = .70$, $r_{02} = .50$, $r_{12} = .40$ for a factor pattern of the type given above. Then $r_{01.2} = .59$ is a measure of the actual net correlation between $x_0$ and $x_1$, i.e., the residue of correlation after the effect of $x_2$ has been eliminated. If, however, $x_2$ includes a factor uncorrelated with $x_0$ and $x_1$, the calculated values of $r_{02}$ and $r_{12}$ will be less than the corresponding coefficients for the component variable. These cannot be calculated, but for purposes of illustration we may take $r_{02} = .65$ and $r_{12} = .60$. For these values $r_{01.2} = .51$. This result is indicative of the effect of an uncorrelated factor in the variable partialed out. The magnitude of the effect upon the coefficient of partial correlation varies, and in the absence of information concerning the factor pattern involved, dependable estimates cannot be made.
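The arithmetic for the second set of illustrative values can be reproduced directly from the partial correlation formula. The function name `partial_r` is introduced here only for illustration.

```python
from math import sqrt

def partial_r(r01, r02, r12):
    """Partial correlation of variables 0 and 1 with variable 2 partialed out."""
    return (r01 - r02 * r12) / (sqrt(1 - r02 ** 2) * sqrt(1 - r12 ** 2))

# r01 = .70 with the values r02 = .65 and r12 = .60 taken for illustration in the text.
value = partial_r(0.70, 0.65, 0.60)
print(round(value, 2))   # 0.51
```

The numerator is .70 - (.65)(.60) = .31 and the denominator is about .760 x .800 = .608, giving .51 as stated.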

It is surprising that in spite of the fact that Yule in his early work made clear what partial correlation actually accomplishes, many writers have made misleading statements concerning it and have employed it in situations in which the technique failed to accomplish the desired end, even though the writer thought it did so. Buckingham¹ makes the statement that the partial correlation coefficient measures the correlation between two things after the influence of others has been taken away. He then goes on with an example in which he obtains the correlation between teachers' salaries and teaching ability, holding constant or removing the effect of professional training. Dunlap and Cureton² state that the coefficient of partial correlation measures the correlation between that part of a variable which is uncorrelated with one or more others and that part of a second variable which is also uncorrelated with these others. Various other writers have made similar statements. If the variable partialed out is not a component, as is usually the case, the obtained coefficient of partial correlation is a measure of the correlation between best estimates. Hence, an investigator should be cautious in the use of the technique of partial correlation and in interpreting results obtained by its use.

It seems unlikely that partial correlation will accomplish the desired result when the variable partialed out consists of test scores or age scores derived from them. Such variables involve variable errors of measurement which will constitute an uncorrelated factor. The effect of this factor may be materially reduced, if not eliminated, by making corrections for attenuation, but it is likely that test scores involve other uncorrelated factors. The statement has been made that it is difficult to conceive of any variable other than chronological age that can be satisfactorily partialed out, and partial correlation will not yield the desired result in this case unless the relationship with the other variable is linear.

Other systems of partial correlation have been proposed,³ but their application is subject to similar limitations.


CONTRIBUTIONS FROM TWO OR MORE MEASURED 
CAUSES: MULTIPLE REGRESSION AND PATH 
COEFFICIENTS 


It is frequently desired to secure measures of the contributions of two or more causes to a given effect or dependent variable. A basis for accomplishing this is an algebraic equation which expresses the dependent variable as a linear function of the independent variables given as causes. An approach to an understanding of the procedure may be made by considering first the case of two uncorrelated causal variables. If $x_0 = x_1 + x_2$, and $x_1$ and $x_2$ are uncorrelated, the determination of the contributions of each of the independent variables to the variance of the dependent variable is very simple.⁴ Since the variables are expressed as deviations from their respective means,

$$\sigma_0^2 = \frac{\Sigma x_0^2}{N} = \frac{\Sigma (x_1 + x_2)^2}{N} = \frac{\Sigma (x_1^2 + 2x_1x_2 + x_2^2)}{N} = \frac{\Sigma x_1^2}{N} + \frac{2\Sigma x_1x_2}{N} + \frac{\Sigma x_2^2}{N}$$

Since $x_1$ and $x_2$ are uncorrelated, $\Sigma x_1 x_2 = 0$ and we have $\sigma_0^2 = \sigma_1^2 + \sigma_2^2$. Hence, the per cent of the variance of $x_0$ contributed by $x_1$ is given by the ratio $\sigma_1^2 / \sigma_0^2$. Similarly, $\sigma_2^2 / \sigma_0^2$ gives the per cent contributed by $x_2$. The value of the variance ratio $\sigma_1^2 / \sigma_0^2$ is equal to $r_{01}^2$ and the variance ratio $\sigma_2^2 / \sigma_0^2$ is equal to $r_{02}^2$.

1. Buckingham, B. R. "Partial Correlation", Journal of Educational Research, 7:544-49, April, 1923. (An editorial.)

2. Dunlap and Cureton, op. cit., p. 665.

3. Dunlap, J. W. and Cureton, E. E., op. cit., pp. 665-72. The origin of semi-partial correlation dates back to the early work of Spearman. See: Spearman, C. "The Proof and Measurement of the Association between Two Things", American Journal of Psychology, 15:94, 1904. The form of the formula for three variables which is given by Dunlap and Cureton was also proposed by Franzen. See: Franzen, Raymond. "A Comment on Partial Correlation", Journal of Educational Psychology, 19:194-97, March, 1928.

4. In this very simple illustration, $x_0$ is assumed to be completely determined by $x_1$ and $x_2$. Also, $x_1$ and $x_2$ are considered to be in terms of equivalent units. The arguments here given may be extended to the case in which the dependent variable is the weighted sum of its components.
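The additive relation for two uncorrelated components can be verified on a small set of deviation measures. The values below are hypothetical and were chosen so that $\Sigma x_1 x_2$ is exactly zero; the helper names are likewise introduced only for this sketch.

```python
from math import sqrt

def variance(deviations):
    """Variance of measures already expressed as deviations from their mean."""
    return sum(v * v for v in deviations) / len(deviations)

def corr(x, y):
    """Product-moment correlation for deviation measures: sum(xy) / (N * sd_x * sd_y)."""
    return sum(a * b for a, b in zip(x, y)) / (len(x) * sqrt(variance(x) * variance(y)))

# Hypothetical deviation measures with sum(x1 * x2) == 0.
x1 = [-3, -1, 1, 3]
x2 = [2, -2, -2, 2]
x0 = [a + b for a, b in zip(x1, x2)]   # x0 = x1 + x2

print(variance(x0), variance(x1), variance(x2))           # 9.0 5.0 4.0
print(round(corr(x0, x1) ** 2, 4), round(corr(x0, x2) ** 2, 4))   # 0.5556 0.4444
```

Here the squared correlations 5/9 and 4/9 are the two variance ratios, and they sum to unity because $x_0$ is completely determined by its two components.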

This development may be extended to any number of uncorrelated component independent variables one of which may represent unmeasured causes. In the typical case, however, the independent variables are correlated and are not components, i.e., do not contribute themselves completely to the dependent variable. This means that the dependent variable cannot be precisely expressed as the weighted sum of the independent variables and unmeasured causes. Hence, the equation developed will be only an approximate expression of the relationship.

The expression,

$$b_{01.2}\,x_1 + b_{02.1}\,x_2$$

gives the best estimate of $x_0$ that can be obtained from the independent variables, $x_1$ and $x_2$. Hence, we use

$$x_0 = b_{01.2}\,x_1 + b_{02.1}\,x_2 + u$$

as the best linear expression of the relationship. In this equation $b_{01.2}$ and $b_{02.1}$ are the ordinary regression coefficients. In terms of the symbolism of the regression equations, $u = x_0 - \bar{x}_0$, where $\bar{x}_0$ is the estimate given by the expression above. This means that the term $u$ will include errors due to the use of the terms $b_{01.2}\,x_1$ and $b_{02.1}\,x_2$ in addition to unmeasured causes. In the development which follows, $u$ will be considered to be uncorrelated with $x_1$ and $x_2$. When this is not true, an additional approximation is introduced.

$$\sigma_0^2 = \frac{\Sigma x_0^2}{N} = \frac{\Sigma (b_{01.2}x_1 + b_{02.1}x_2 + u)^2}{N}$$

$$= \frac{\Sigma (b_{01.2}^2 x_1^2 + b_{02.1}^2 x_2^2 + 2b_{01.2}b_{02.1}x_1x_2 + u^2 + 2b_{01.2}x_1u + 2b_{02.1}x_2u)}{N}$$

$$= b_{01.2}^2\frac{\Sigma x_1^2}{N} + b_{02.1}^2\frac{\Sigma x_2^2}{N} + 2b_{01.2}b_{02.1}\frac{\Sigma x_1x_2}{N} + \frac{\Sigma u^2}{N} + 2b_{01.2}\frac{\Sigma x_1u}{N} + 2b_{02.1}\frac{\Sigma x_2u}{N}$$

Since $u$ is assumed to be uncorrelated with either $x_1$ or $x_2$, $\Sigma x_1u$ and $\Sigma x_2u$ are equal to zero, and the last two fractions in the above equation disappear. The remaining product term may be written as follows:

$$2b_{01.2}b_{02.1}\frac{\Sigma x_1x_2}{N} = 2b_{01.2}b_{02.1}\frac{\Sigma x_1x_2}{N\sigma_1\sigma_2}\,\sigma_1\sigma_2 = 2b_{01.2}b_{02.1}\,r_{12}\,\sigma_1\sigma_2$$

Thus we have

$$\sigma_0^2 = b_{01.2}^2\sigma_1^2 + b_{02.1}^2\sigma_2^2 + 2b_{01.2}\sigma_1\, b_{02.1}\sigma_2\, r_{12} + \sigma_u^2$$

The term $b_{01.2}^2\sigma_1^2$ represents the direct contribution of $x_1$, the term $b_{02.1}^2\sigma_2^2$ represents the direct contribution of $x_2$, and the term $2b_{01.2}\sigma_1\, b_{02.1}\sigma_2\, r_{12}$ is a measure of the joint contribution of $x_1$ and $x_2$ to $x_0$. The last term, $\sigma_u^2$, represents the contribution of unmeasured causes and the errors due to the approximations introduced. The direct and joint contributions of $x_1$ and $x_2$, as well as the contribution of $u$, may be expressed in per cents by dividing both sides of the equation by $\sigma_0^2$.

$$1 = \frac{\sigma_0^2}{\sigma_0^2} = b_{01.2}^2\frac{\sigma_1^2}{\sigma_0^2} + b_{02.1}^2\frac{\sigma_2^2}{\sigma_0^2} + 2b_{01.2}\frac{\sigma_1}{\sigma_0}\, b_{02.1}\frac{\sigma_2}{\sigma_0}\, r_{12} + \frac{\sigma_u^2}{\sigma_0^2}$$

The term $b_{01.2}^2\frac{\sigma_1^2}{\sigma_0^2}$ is the square of the corresponding Beta coefficient of the multiple regression equation and may be written $\beta_{01.2}^2$. Likewise, $b_{02.1}^2\frac{\sigma_2^2}{\sigma_0^2}$ may be written $\beta_{02.1}^2$, and the term representing the joint contribution of $x_1$ and $x_2$ may be written $2\beta_{01.2}\beta_{02.1}r_{12}$. Hence, the equation may be written

$$1 = \beta_{01.2}^2 + \beta_{02.1}^2 + 2\beta_{01.2}\beta_{02.1}r_{12} + \frac{\sigma_u^2}{\sigma_0^2}$$

This equation may be expressed in terms of the following symbols.

$$1 = d_{01.2} + d_{02.1} + 2p_{01.2}\,p_{02.1}\,r_{12} + d_{0u}$$

$$= d_{01.2} + d_{02.1} + d_{012} + d_{0u}$$
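The decomposition into direct, joint, and unmeasured contributions can be computed from three correlations. The sketch below uses the standard expressions for the Beta coefficients of a two-variable regression; the correlations (.60, .50, .40) and all names are hypothetical, introduced only to show the bookkeeping.

```python
def beta_weights(r01, r02, r12):
    """Standardized regression (Beta) coefficients of x0 on x1 and x2."""
    denom = 1 - r12 ** 2
    return (r01 - r02 * r12) / denom, (r02 - r01 * r12) / denom

# Hypothetical correlations, not taken from the coin-toss data.
r01, r02, r12 = 0.60, 0.50, 0.40
b1, b2 = beta_weights(r01, r02, r12)

d_direct_1 = b1 ** 2            # direct determination by x1
d_direct_2 = b2 ** 2            # direct determination by x2
d_joint = 2 * b1 * b2 * r12     # joint determination by x1 and x2
explained = d_direct_1 + d_direct_2 + d_joint
d_unmeasured = 1 - explained    # share left to u

for label, value in (("direct x1", d_direct_1), ("direct x2", d_direct_2),
                     ("joint", d_joint), ("u", d_unmeasured)):
    print(label, round(value, 4))
```

The four printed shares sum to unity by construction, and the explained part equals $\beta_{01.2} r_{01} + \beta_{02.1} r_{02}$, the usual expression for the squared multiple correlation.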





The term $d_{01.2}$ is read the coefficient of direct determination of $x_0$ with respect to $x_1$, $d_{02.1}$ is the coefficient of direct determination of $x_0$ with respect to $x_2$, and $d_{012}$ is the coefficient of joint determination of $x_1$ and $x_2$ with respect to $x_0$. The term $d_{012}$ is written $2p_{01.2}\,p_{02.1}\,r_{12}$. The symbol $p_{01.2}$ designates the path coefficient¹ connecting $x_0$ and $x_1$ and the symbol $p_{02.1}$ designates the path coefficient connecting $x_0$ and $x_2$.

The general equation in terms of coefficients of determination for $n$ independent variables is:

$$1 = d_{01.23\ldots n} + d_{02.13\ldots n} + \cdots + d_{0n.12\ldots(n-1)} + d_{012.3\ldots n} + d_{013.24\ldots n} + \cdots + d_{0ij.(\ldots)} + d_{0u}$$

In the next to the last term, $i$ takes any value from 1 to $n$, and $j$ takes any value from 1 to $n$ other than $i$. There will be as many terms of joint determination as there are possible pairs of independent variables. These coefficients of joint determination may be expressed in terms of the coefficient of correlation between the two independent variables and the path coefficients connecting them with the dependent variable. For example,

$$d_{012.3\ldots n} = 2\,r_{12}\,p_{01.23\ldots n}\,p_{02.13\ldots n}$$

The values of the coefficients of determination $d_{01.23\ldots n}$, $d_{02.13\ldots n}$, $d_{012.34\ldots n}$, etc., may be calculated from the data. The value of $d_{0u}$ is obtained by subtracting the sum of the known coefficients of determination, i.e., $d_{01.23\ldots n}$, $d_{02.13\ldots n}$, $d_{012.34\ldots n}$, etc., from unity.

The coefficients of determination may be obtained from the beta coefficients of the regression equation or they may be calculated by means of Wright's method of path coefficients. The fundamental theorem of Wright's method² may be stated as follows: Given $x_0$, the dependent variable, and $x_1, x_2, x_3 \ldots x_n$ independent variables, the coefficient of correlation between $x_0$ and any of the independent variables or between any two of the independent variables, is equal to the path coefficient connecting the two variables plus the sum of the products of the path coefficients along all paths of indirect connection, not including those through the dependent variable.¹

1. A path coefficient is defined as the ratio of that part of the standard deviation of a variable which is due to another variable to the total standard deviation of the variable. In other words, a path coefficient represents the ratio of the estimate of that part of the standard deviation of the dependent variable which is due to another variable to the total standard deviation of the dependent variable. Thus, $p_{01.2} = b_{01.2}\frac{\sigma_1}{\sigma_0}$ as indicated above, and the term $b_{01.2}^2\frac{\sigma_1^2}{\sigma_0^2}$ which was written as the Beta coefficient $\beta_{01.2}^2$ might also be written $p_{01.2}^2$; hence $\beta_{01.2} = p_{01.2}$.

For the development of path coefficients, see:
Wright, Sewall. "Correlation and Causation", Journal of Agricultural Research, 20:557-85, January, 1921.
Wright, Sewall. "The Theory of Path Coefficients", Genetics, 8:239-55, May, 1923.

For further proof of the identity of path coefficients and Beta coefficients, see:
Kelly, E. L. "The Relationship between the Techniques of Partial Correlation and Path Coefficients", Journal of Educational Psychology, 20:119-24, February, 1929.
Dunlap, J. W. and Cureton, E. E. "On the Analysis of Causation", Journal of Educational Psychology, 21, December, 1930.

2. For illustration of Wright's method, see:
Burks, B. S. "The Relative Influence of Nature and Nurture Upon Mental Development; a Comparative Study of Foster Parent-Foster Child Resemblance and True Parent-True Child Resemblance", Twenty-Seventh Yearbook of the National Society for the Study of Education, Part I. Bloomington, Illinois: Public School Publishing Company, 1928, pp. 299-301.
Heilman, J. D. "The Relative Influence Upon Educational Achievement of Some Hereditary and Environmental Factors", The Twenty-Seventh Yearbook of the National Society for the Study of Education, Part II. Bloomington, Illinois: Public School Publishing Company, 1928, pp. 35-65. For a more extended account, see:
Heilman, J. D. "Factors Determining Achievement and Grade Location", Journal of Genetic Psychology, September, 1929.

In the case of three independent variables, this theorem provides the basis for writing the following equations²

$$r_{01} = p_{01} + p_{12}p_{02} + p_{13}p_{03} + p_{12}p_{23}p_{03} + p_{13}p_{23}p_{02}$$
$$r_{02} = p_{02} + p_{12}p_{01} + p_{23}p_{03} + p_{23}p_{13}p_{01} + p_{12}p_{13}p_{03}$$
$$r_{03} = p_{03} + p_{13}p_{01} + p_{23}p_{02} + p_{23}p_{12}p_{01} + p_{13}p_{12}p_{02}$$
$$r_{12} = p_{12} + p_{23}p_{13}$$
$$r_{13} = p_{13} + p_{23}p_{12}$$
$$r_{23} = p_{23} + p_{13}p_{12}$$

Although not attempted by Wright, Heilman, or Burks, it is apparent that these equations may be simplified. Collecting terms in the first three equations, we have

$$r_{01} = p_{01} + [(p_{12} + p_{13}p_{23})\,p_{02}] + [(p_{13} + p_{12}p_{23})\,p_{03}]$$
$$r_{02} = p_{02} + [(p_{12} + p_{23}p_{13})\,p_{01}] + [(p_{23} + p_{12}p_{13})\,p_{03}]$$
$$r_{03} = p_{03} + [(p_{13} + p_{23}p_{12})\,p_{01}] + [(p_{23} + p_{13}p_{12})\,p_{02}]$$
$$r_{12} = p_{12} + p_{13}p_{23}$$
$$r_{13} = p_{13} + p_{12}p_{23}$$
$$r_{23} = p_{23} + p_{12}p_{13}$$

Hence the first three equations may be written as follows:

$$r_{01} = p_{01} + r_{12}\,p_{02} + r_{13}\,p_{03}$$
$$r_{02} = p_{02} + r_{12}\,p_{01} + r_{23}\,p_{03}$$
$$r_{03} = p_{03} + r_{13}\,p_{01} + r_{23}\,p_{02}$$
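The three simplified equations are linear in $p_{01}$, $p_{02}$, and $p_{03}$ and can be solved once the six correlations are known. The sketch below uses Cramer's rule for the three-variable case; this is a choice made for brevity here, not the schema the article refers to, and the correlations and function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def path_coefficients(r0, r12, r13, r23):
    """Solve r0i = p0i + sum over j of r_ij * p0j for (p01, p02, p03) by Cramer's rule."""
    corr_matrix = [[1.0, r12, r13],
                   [r12, 1.0, r23],
                   [r13, r23, 1.0]]
    base = det3(corr_matrix)
    solution = []
    for column in range(3):
        replaced = [row[:] for row in corr_matrix]
        for row in range(3):
            replaced[row][column] = r0[row]   # swap in the right-hand side
        solution.append(det3(replaced) / base)
    return solution

# Hypothetical correlations, chosen only for illustration.
r0 = (0.60, 0.50, 0.40)            # r01, r02, r03
r12, r13, r23 = 0.40, 0.30, 0.20
p01, p02, p03 = path_coefficients(r0, r12, r13, r23)
print(round(p01, 4), round(p02, 4), round(p03, 4))

# Substituting back reproduces the correlations with the dependent variable.
rebuilt = (p01 + r12 * p02 + r13 * p03,
           p02 + r12 * p01 + r23 * p03,
           p03 + r13 * p01 + r23 * p02)
```

For more than three independent variables a general elimination routine would be preferable to determinants.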
Examination of these equations will indicate how the corresponding equations for any number of variables may be written.

The above equations are the "normal equations" from which regression coefficients may be calculated.³ The schema formulated by Griffin⁴ affords an economical method. The coefficient of multiple correlation affords a means of checking the calculations involved in securing the values of the coefficients of determination. The square of the coefficient of multiple correlation $R_{0.12\ldots n}$ is equal to the sum of the coefficients of determination, exclusive of $d_{0u}$. In other words, $R_{0.12\ldots n}^2 = 1 - d_{0u}$. Since $R_{0.12\ldots n}$ is obtained through a different set of calculations, a check between $R_{0.12\ldots n}^2$ and the sum of the coefficients of determination is a good indication that the calculations are without error.

1. For proof of this theorem, see the references to Wright's work.

2. The reader should note that the subscripts of the path coefficients have been simplified.

3. Tolley, H. R. and Ezekiel, M. J. B. "A Method of Handling Multiple Correlation Problems", Journal of American Statistical Association, 18:993-1003, December, 1923. Barrett, H. E. "A Modification of Tolley and Ezekiel's Method of Handling Multiple Correlation Problems", Journal of Educational Psychology, 19:45-49, January, 1928.

4. Griffin, H. D. "Simplified Schemas for Multiple Linear Correlation", Journal of Experimental Education, 1:239-54, March, 1933.

The application of the method defined by the coefficients of determination may be illustrated by taking variables constructed from counts of coin tosses as follows:

$$X_0 = A_1 + A_2 + A_3 + A_4 + E_0$$
$$X_1 = A_1 + A_2 + E_1$$
$$X_2 = A_1 + A_3 + E_2$$
$$X_3 = A_1 + A_4 + E_3$$

The structure of these variables is probably not greatly different from that which we might have for a situation in which the contributions of certain factors to achievement in a subject such as chemistry are being studied. In this set-up, $X_0$ may be thought of as the dependent variable representing scores on an achievement test in chemistry, and $X_1$, $X_2$, and $X_3$ may be thought of as measures of abilities which contribute to achievement in chemistry. The component $A_1$ may be thought of as a "general factor"; $A_2$, $A_3$, and $A_4$ may be thought of as factors unique to each ability; $E_0$, $E_1$, $E_2$, and $E_3$ may be thought of as representing variable errors of measurement and validity.

Applying the path coefficient technique to this problem, the following values were obtained for the coefficients of determination.

$d_{01}$  = .1901
$d_{02}$  = .1116
$d_{03}$  = .0400
$d_{012}$ = .1044
$d_{013}$ = .0682
$d_{023}$ = .0536
Total       .5679

Subtracting the sum of the coefficients of determination from 1.00, we have $d_{0u}$ = .4321. By the above definition of $X_0$, the unmeasured cause is represented by $E_0$. Direct calculation gives the value, $d_{0E_0}$ = .1000. This means that the obtained values of the coefficients of determination are too small. Instead of their sum being only .5679, it should be .9000. The attenuating effect is due to the fact that the use of the regression equation, when the independent variables are not components, results in only an approximate expression of the existing relationship. The calculated values of the coefficients of determination may indicate the relative order of magnitude of the contributions, but the fact that they are too small is a rather serious limitation of the technique.
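The attenuation can be reproduced at the population level, without sampling error, from the factor pattern alone. The sketch below is not a recomputation of the writers' data: it assumes a variance of 7.5 for every A collection (thirty coins) and 4.0 for every E collection (sixteen coins, which is an assumption, since the number of coins in the E collections is not stated). All names are introduced for this illustration.

```python
from math import sqrt

VAR_A = 7.5   # variance of heads in 30 fair coins (30 * 0.25)
VAR_E = 4.0   # ASSUMED: 16 coins per E collection (16 * 0.25)

# X0 = A1+A2+A3+A4+E0;  X1 = A1+A2+E1;  X2 = A1+A3+E2;  X3 = A1+A4+E3
var_x0 = 4 * VAR_A + VAR_E
var_xi = 2 * VAR_A + VAR_E
r_0i = 2 * VAR_A / sqrt(var_x0 * var_xi)   # each Xi shares two A's with X0
r_ij = VAR_A / var_xi                      # any two Xi share only A1

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(v)] for row, v in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

normal_matrix = [[1.0, r_ij, r_ij], [r_ij, 1.0, r_ij], [r_ij, r_ij, 1.0]]
paths = solve(normal_matrix, [r_0i] * 3)            # path (Beta) coefficients

direct = [p * p for p in paths]
joint = [2 * paths[i] * paths[j] * r_ij for i in range(3) for j in range(i + 1, 3)]
explained = sum(direct) + sum(joint)                # sum of coefficients of determination
d_0u = 1 - explained
d_0E0 = VAR_E / var_x0                              # true share of the unmeasured cause

print(round(explained, 4), round(d_0u, 4), round(d_0E0, 4))
```

Under these assumptions the coefficients of determination sum to roughly .58 while the measured factors actually account for roughly .88 of the variance of $X_0$, the same kind of shortfall as the .5679 against .9000 reported above.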

Even if the coefficients of determina- 
tion were not attenuated, there would still 
remain a serious difficulty of interpreta- 
tion. For example, if X, and X, represent 
scores on an intelligence test and a silent 
reading test, the total coefficient of de- 
termination'for X, camot be interpreted as 
a measure of the contribution from general 
intelligence because obviously the obtained 
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measures of this trait and those of silent 
reading ability have much in common. Coef- 
ficients of determination for the factor 
common to X, and X,, the component of X, 
uncorrelated with X,, and the component of 
X2 uncorrelated with X, would be more mean- 
ingful statistics. These factors may be 
described as defining causes that are ele- 
mental* with respect to the two given vari-* 
ables. 


FACTOR ANALYSIS 


If several correlated variables are
being considered together, it is generally
agreed that four types of factors can be
hypothesized, i.e., general factors, group
factors, specific factors, and chance factors.
A general factor is one that is
present in all the variables which are being
considered together and a group factor
is one which is present in two or more
variables. Specific factors and chance factors
are unique to each particular variable.
Let a_1, a_2, a_3 represent uncorrelated remote
causes of x_0, s_0 a specific factor,
and e_0 a chance factor. Assuming that x_0
is a linear function of its causes, it may
be expressed,

\[ x_0 = c_{01} a_1 + c_{02} a_2 + c_{03} a_3 + c_{04} s_0 + c_{05} e_0 \]

If the a's account for the correlations of
x_0 with the independent variables x_1, x_2,
x_3, x_4, and x_5, they may be described as
the elemental (remote) causes of the inde- 
pendent variables as well as of the de- 
pendent variable and the contributions of 
the independent variables to the dependent 
variable may be thought of as being made 
through these remote causes. In such a 
case each independent variable would be ex- 
pressed as a linear function of one or more 
of the remote causes, a specific factor, 
and a chance factor. 





1. The total coefficient of determination for x_1 will consist of the coefficient of direct determination for this
variable plus a portion of the coefficient of joint determination. In the absence of a better method, a coefficient
of joint determination has been divided in proportion to the coefficients of direct determination of the two
variables.

2. In explaining the general problem for which he developed the path coefficient technique, Wright introduced such a
group of causes which he designated as "remote", but he does not give any technique for identifying and measuring
their contributions.

\[ x_1 = c_{11} a_1 + c_{12} a_2 + c_{13} a_3 + c_{14} s_1 + c_{15} e_1 \]

\[ x_2 = c_{21} a_1 + c_{22} a_2 + c_{23} a_3 + c_{24} s_2 + c_{25} e_2 \]

\[ x_3 = c_{31} a_1 + c_{32} a_2 + c_{33} a_3 + c_{34} s_3 + c_{35} e_3 \]

\[ x_4 = c_{41} a_1 + c_{42} a_2 + c_{43} a_3 + c_{44} s_4 + c_{45} e_4 \]

\[ x_5 = c_{51} a_1 + c_{52} a_2 + c_{53} a_3 + c_{54} s_5 + c_{55} e_5 \]








If a factor is not a component of a
particular independent variable, the term
corresponding to it in the above equations
would be equal to zero. For example, if one
of the a's were not present in two of the
independent variables, the two terms containing
it would drop out of the equations written for
those variables. In such a situation we would
say that the other two a's were general factors,
being present in all the variables, and the
remaining a would be called a group
factor, being present in only three of the
variables.
The contribution of each of the remote
causes to x_0 may be expressed in terms of a
variance ratio. For example, the contribution
of a_1 to x_0 is expressed by the variance
ratio

\[ \frac{c_{01}^2 \sigma_{a_1}^2}{\sigma_{x_0}^2} \]

If the a's, s_0, and e_0 are expressed in terms
of standard units, we may write

\[ \sigma_{a_1}^2 = \sigma_{a_2}^2 = \sigma_{a_3}^2 = \sigma_{s_0}^2 = \sigma_{e_0}^2 = 1 \]

If the c's are chosen so that \sigma_{x_0} = 1, the
variance ratios reduce to squares of the
c's and we have,

\[ c_{01}^2 + c_{02}^2 + c_{03}^2 + c_{04}^2 + c_{05}^2 = 1 \]


The problem of determining the contributions
from the remote or elemental causes
is one of determining the c's in the expression
for x_0, which are called factor loadings.
As a basis for this determination,
we may write equations similar to the above
for each of the independent variables. The
correlation between each of the variables
in the above set of equations may be
expressed in terms of c's. For example, the
correlation or communality between x_0 and
x_1 may be written,¹

\[ r_{01} = c_{01} c_{11} + c_{02} c_{12} + c_{03} c_{13} \]

The coefficient of reliability is equal to
1.00 minus the square of the factor loading
of the corresponding chance factor. For
example,

\[ r_{00} = 1.00 - c_{05}^2 = c_{01}^2 + c_{02}^2 + c_{03}^2 + c_{04}^2 \]
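The two relations just stated can be checked numerically. The following Python sketch is an illustration only: the loadings are hypothetical values chosen so that their squares sum to one, the factors are assumed to be normally distributed, and the function names are not taken from the article.

```python
import math
import random

# Hypothetical loadings on (a_1, a_2, a_3, specific, chance); squares sum to 1.
C0 = [0.6, 0.5, 0.3, 0.4, math.sqrt(0.14)]
C1 = [0.7, 0.2, 0.4, 0.3, math.sqrt(0.22)]


def correlation_from_loadings(c_i, c_j, n_common=3):
    """Sum of products of the loadings on the shared factors."""
    return sum(p * q for p, q in zip(c_i[:n_common], c_j[:n_common]))


def reliability(c):
    """1.00 minus the squared loading of the chance factor (last entry)."""
    return 1.0 - c[-1] ** 2


def simulate(c_i, c_j, n=20000, seed=0):
    """Draw n paired scores from the assumed factor model."""
    rng = random.Random(seed)
    xi, xj = [], []
    for _ in range(n):
        common = [rng.gauss(0, 1) for _ in range(3)]
        # Each variable gets its own specific and chance factor.
        fi = common + [rng.gauss(0, 1), rng.gauss(0, 1)]
        fj = common + [rng.gauss(0, 1), rng.gauss(0, 1)]
        xi.append(sum(c * f for c, f in zip(c_i, fi)))
        xj.append(sum(c * f for c, f in zip(c_j, fj)))
    return xi, xj


def pearson_r(x, y):
    """Product-moment correlation of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    x0, x1 = simulate(C0, C1)
    print("r_01 from loadings:", round(correlation_from_loadings(C0, C1), 3))
    print("r_01 from sample:  ", round(pearson_r(x0, x1), 3))
    print("r_00 (reliability):", round(reliability(C0), 3))
```

With these assumed loadings the correlation computed from the loadings is .64, and the sample correlation of the simulated scores should fall close to it.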


1. The proof is as follows:
The coefficient of correlation r_{01} may be expressed,

\[ r_{01} = \frac{\sum x_0 x_1}{N \sigma_0 \sigma_1} \]

Since x_0 and x_1 are expressed in terms of standard units, \sigma_0 = \sigma_1 = 1, and we can write,

\[ r_{01} = \frac{\sum x_0 x_1}{N} \]

Substituting the values for x_0 and x_1 we have

\[ r_{01} = \frac{\sum (c_{01} a_1 + c_{02} a_2 + c_{03} a_3 + c_{04} s_0 + c_{05} e_0)(c_{11} a_1 + c_{12} a_2 + c_{13} a_3 + c_{14} s_1 + c_{15} e_1)}{N} \]

Multiplying these two terms, all the resulting products involving uncorrelated components are equal to zero, hence,

\[ r_{01} = c_{01} c_{11} \frac{\sum a_1^2}{N} + c_{02} c_{12} \frac{\sum a_2^2}{N} + c_{03} c_{13} \frac{\sum a_3^2}{N} \]

The terms \sum a_1^2 / N, \sum a_2^2 / N, and \sum a_3^2 / N are, of course, nothing more than the squares of the respective standard deviations, \sigma_{a_1}^2, \sigma_{a_2}^2, \sigma_{a_3}^2. But
\sigma_{a_1} = \sigma_{a_2} = \sigma_{a_3} = 1, since a_1, a_2, a_3 are expressed in terms of standard units, so that

\[ r_{01} = c_{01} c_{11} + c_{02} c_{12} + c_{03} c_{13} \]

When the number of equations thus
formed is equal to the number of c's, it is
theoretically possible to determine their
values. However, the labor of solving a
large number of simultaneous quadratic
equations prohibits a direct attack as a
feasible procedure. Kelley¹ has proposed a
method of successive approximations based
upon least squares, but Holzinger² has
shown that several different solutions may
be fitted to Kelley's data. Recently
Thurstone³ has developed a technique which
gives a unique solution in the case of certain
factor patterns.

Factor analysis appears to offer a 
means of determining the contributions to a 
given dependent variable from the causes
that are elemental with reference to the
group of independent variables. It is not
necessary to demonstrate that the given in- 
dependent variables are causally related to 
the variable designated as dependent. The 
a's are causes and when all measures are 
expressed in terms of standard units, the 
squares of the factor loadings of the de- 
pendent variable will measure the contribu- 
tions of the elemental components. It 
should be noted, however, that the con- 
tributions measured are those of the ele- 
mental causes to the variance of the de- 
pendent variable and not to measures of the 
dependent variable. In other words factor
analysis will not yield measures of the
contributions to measures of achievement
or of other traits.

The principal conclusions of this article
may be stated formally as follows:

1. If a variable is known to be a component
of another variable, the contribution
of the first or of a common cause perfectly
correlated with it to the variance
of the second is measured by the square of
the coefficient of correlation between the
two variables.

2. The contributions of a third corre- 
lated variable may be partialed out when it 
is a component of the other two. When it 
is not a component, application of partial 
correlation yields only an estimate of the 
net correlation between the two variables. 
This estimate tends to be numerically larg- 
er than the true net correlation. 

3. The measures of the contributions 
of independent variables obtained from the 
beta coefficients of a multiple regression 
equation or by means of Wright's path co- 
efficient technique are attenuated esti- 
mates. Hence, these methods are not satis- 
factory for studying the contributions of 
independent variables to a dependent vari- 
able. 

4. Factor analysis appears to afford
a means for securing measures of the
contributions from elemental causes, but the
method proposed by Kelley and that first
proposed by Thurstone are not adequate.
Thurstone's method for a unique solution
appears to be satisfactory, but its application
is dependent upon the existence of
a certain factor pattern.

Another reference dealing with the same problem is Hotelling, Harold.


1. Kelley, T. L. Crossroads in the Mind of Man. Stanford University, California: Stanford University Press, 1928,


THE CORRELATION COEFFICIENT AS AN INDEX OF RELATIONSHIP
by 
Paul Hanly Furfey 
and 
Joseph F. Daly 
The Catholic University of America 


During the last few decades the 
technique of correlation has been in- 
creasingly used in the social sciences 
to measure what is called the relation be- 
tween variables. Thus it is usual 
to say that if two variables yield a 
correlation coefficient of +0.95 they 
are "closely related." If the coefficient
is near zero, we say they are "unrelated."
We use such language with
more confidence if the coefficients in 
question are product-moment coefficients 
or ones considered equivalent to such 
coefficients. But the concept of meas- 
uring relationship is quite general 
and is used to some extent in regard to a 
variety of different techniques. 

In view of the wide use of correlation 
and the considerable amount of mathematical 
discussion which it has evoked, it seems 
strange that there has been very little dis- 
cussion of the meaning of the words rela- 
tion and closeness of relationship as used 
in this connection. The present article is 
offered as a contribution to this problem. 
In what sense can we say that correlation 
measures relationship? 

It is hardly necessary to remark that
correlation does not measure causal
relationship. For example, if it is shown that
there is a high correlation between annual
marriage rates and some index of general
business condition, then the said correlation
does not prove that fluctuations in
business cause changes in marriage rates,
nor that marriage rates change the condition
of business, nor that both variables
depend on some third factor. These questions
are to be decided from considerations
quite apart from the mathematics of correlation.
Mathematics cannot measure causality.
At most it can tell us something about
concomitant variation. Since the term
relationship may suggest the existence of
causal relationship, it would probably be
better to discard it altogether and use the
less ambiguous term concomitant variation.
We shall not, however, endeavor to introduce
this somewhat awkward term but shall
continue in this article to use the terms
relation and relationship, bearing in mind
the restrictions discussed in this paragraph.

Correlational analysis, then, is con- 
cerned with concomitant variation. Before 
pursuing this subject further, it is worth 
noting that the use of correlation implies 
a type of problem quite strikingly different 
from that usual in the physical sciences. In 
the latter, relationship is simply consid- 
ered to be present or absent. Generally 
speaking, the physical scientist refuses to 
recognize intermediate degrees of relation- 
ship. If the points representing the paired 
values of the two variables fall along some 
reasonably simple mathematical curve, then 
he considers that the variables are related. 
If the points are so scattered that such a 
curve cannot be drawn, he ordinarily sus- 
pends judgment. 

Not so in the social sciences! Here
the extremely simple type of relationship
which may be expressed by a mathematical
curve occurs but rarely. The social scientist
therefore introduces the new concept
of closeness of relationship, that is, the
degree to which the bivariate distribution
in question approximates the perfect type
of relationship expressible by some relatively
simple mathematical function. To
characterize this degree of approximation in
a quantitative manner is the function of
correlation.


March, 1935 


It cannot be too strongly emphasized 
that correlation is an arbitrary process. 
We are not forced by the nature of things 
to adopt any particular definition for the
concept closeness of relationship. We are 
quite free to choose our own definition. As 
a matter of fact a great many different 
definitions have been suggested, since each
of the many suggested ways of measuring re- 
lationship implies its own definition. We 
cannot say that one of these is right and 
the others wrong. We can merely say that 
one is more convenient to calculate than 
the others, or more useful in solving cer- 
tain problems, or easier to interpret. 

We have said above that correlation essentially
is an effort to measure the degree
to which a given bivariate distribution
departs from the perfect type of relation- 
ship expressible by some simple mathemati- 
cal curve. It is evident, therefore, that 
the correlation technique must involve two 
choices: (1) the choice of some mathemat- 
ical function as a standard of perfect re- 
lationship, and (2) the choice of some nu- 
merical measure of the degree to which the 
given bivariate distribution departs from
this standard. 

We might discuss the significance of 
the above two choices in regard to any of 
the numerous devices for measuring correla- 
tion which have been proposed at various 
times. For the sake of brevity, however, 
we shall confine ourselves to the discus- 
sion of the product-moment correlation co- 
efficient (r) and the correlation ratio 
(eta). Since these two measures of rela- 
tionship are usually considered the best 
available, any criticisms we make of them 
may be expected to apply with still more 
force to the other and admittedly inferior 
measures. 

The standard of perfect relationship 
in the case of eta requires that all the 
points in the scatter diagram shall fall on 
a curve expressible as a mathematical func- 
tion which is single-valued in respect to 
both variables. In the case of r all the 
points must fall on a straight line if the 
relationship is to be considered perfect. 
As far as their standard of perfect rela- 
tionship is concerned, r may evidently be 
considered a special case of eta. For this
reason we shall refer to the two methods
collectively as the r-eta technique. 

The standard of perfect relationship, 
as defined in the r-eta technique, is not 
entirely arbitrary. At least it is "natu- 
ral" in the sense that relationship thus 
defined as perfect is equivalent to rela- 
tionship as it is defined in the physical 
sciences by means of mathematical equations. 
But the second element in the r-eta tech- 
nique, namely, the measurement of the de- 
gree of departure from this perfect rela- 
tionship, is arbitrary to a considerably 
greater extent. The words, "closeness of 
relationship" have no meaning in the Eng- 
lish language which is so definite that 
that meaning can be converted into mathe- 
matical terms in the sense that, for exam- 
ple, the word, "velocity" can be so con- 
verted. Closeness of relationship is itself 
defined by the equations which define r and 
eta and we cannot quarrel with that defini- 
tion. Just so every other measure of cor- 
relation, say Spearman's foot rule, defines 
some sort of "closeness of relationship" 
and we cannot say that one definition is 
better than another, except in the sense 
that one definition may be more useful. The 
only valid test of a definition is the 
pragmatic test. 

We can make these definitions more 
concrete in our own minds by interpreting 
them in various ways. Thus we may, if we 
wish, interpret the closeness of the relationship
between x and y, when r is less than
unity in absolute value, by looking upon x 
as the sum of two components of which one 
is perfectly correlated with y, while the 
other has zero correlation with y. Or we 
may interpret r as the slope of a regres- 
sion line. Or we may interpret either r or 
eta as a function of the amount of scatter 
around the regression lines or regression 
curves. These interpretations do not make 
the definition of closeness of relationship 
any more "natural" but they may make the
definition more useful and thus constitute 
an argument in favor of the use of the r- 
eta technique. 

To be useful a definition must be un- 
ambiguous. We shall now proceed to criti- 
cize the r-eta definition of "closeness of 
relationship" on the ground that it yields
different numerical values for the same
bivariate distribution. If we can establish
this fact, then the definition is ambiguous; 
it is imperfectly useful; and it may proper- 
ly be considered a poor definition. 

The r-eta technique measures relation- 
ship by measuring departure from a standard 
of perfect relationship. This standard is 
a mathematical curve. If the relationship 
is imperfect the r-eta technique provides 
no unambiguous way for discovering this 
curve. We are left to choose it ourselves.
The value of the numerical measure of close- 
ness of relationship depends on this choice. 
Therefore the r-eta definition of closeness 
of relationship is ambiguous. 

Let us try to make this concrete. We
shall discuss, therefore, the difficulties 
of measuring the relation of y to x by 
means of the regression curve of y on x. Of 
course in any practical case there is the 
added question of the regression of x on y. 
Let the scattergram be divided into columns
by lines parallel to the axis of y. Let
the means of these columns be computed and
let these means be connected by some function
y = f(x) which passes through all of
them. Let us suppose that this function is 
a straight line. The regression is said to 
be linear and r is considered the appropri- 
ate measure of relationship to be used. 

This is an ideal case which seldom or 
ever occurs in actual practice. But even 
this ideal case is subject to a certain am- 
biguity. For we are interested usually, not 
in measuring closeness of relationship in 
the particular sample at hand, but rather 
in measuring it in the bivariate universe 
from which this sample was drawn. Even if 
we decide that it is appropriate to treat 
this sample as linear, we have no assurance 
that the corresponding regression in the bi- 
variate universe is also linear. 

The force of this objection is clearer 
when we consider the vastly more common 
case when the function which passes through 
the means of the columns is some non-linear 
function, say, Y = F(x). Here there arises 
the question whether to use this particular 
function as a basis for calculating eta, or
to divide the scattergram into another set
of columns and obtain probably a different
value for eta, or to use r as a matter of
convenience as though the regression were 
linear. 
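The dependence of eta on the choice of columns can be illustrated with a short Python sketch. The paired observations below are made up for the purpose, and the helper name `correlation_ratio` and the equal-width grouping rule are assumptions of this illustration, not a procedure given by the authors.

```python
import math


def correlation_ratio(xs, ys, width):
    """Eta of y on x, with columns formed by cutting x into bins of `width`."""
    columns = {}
    for x, y in zip(xs, ys):
        columns.setdefault(math.floor(x / width), []).append(y)
    grand_mean = sum(ys) / len(ys)
    total = sum((y - grand_mean) ** 2 for y in ys)
    # Sum of squares of the column means about the grand mean, weighted by size.
    between = sum(
        len(col) * (sum(col) / len(col) - grand_mean) ** 2
        for col in columns.values()
    )
    return math.sqrt(between / total)


# Made-up paired observations, for illustration only.
XS = [0, 1, 2, 3, 4, 5, 6, 7]
YS = [1, 3, 2, 6, 5, 9, 7, 12]

if __name__ == "__main__":
    for width in (1, 2, 4):
        print("column width", width, "eta", round(correlation_ratio(XS, YS, width), 3))
```

The same eight points yield a different eta for each of the three column widths, and the narrowest columns (one point each) give an eta of unity regardless of the data.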

One might adopt as a criterion the 
principle that we should choose the function 
which most closely approximates the corre- 
sponding regression function in the bivari- 
ate universe from which this sample was 
drawn. Most discussions are based on this 
principle; but the hopelessness of the situation
is evident. It involves arguing from
the shape of this sample to the shape of
the universe from which it was drawn and
thence back to this sample, a vicious circle.
For example, we may calculate zeta
and compare it with its σ. This procedure
gives us the probability that a sample
drawn at random from a linear universe
should depart from linearity as widely as
this sample. But this is not the same
thing as the probability that this sample
was drawn from a linear universe. The evident
distinction between these two probabilities
is seldom recognized by statisticians.
Whenever we endeavor to measure close- 
ness of relationship between two variables 
in a statistical sample we are faced with 
the two questions: Shall we treat this 
sample as linear or non-linear? If it is 
non-linear, which of the various possible 
regression functions shall we choose? Ac- 
cording as we choose different answers to 
these questions we shall obtain different 
numerical measures of the existing close- 
ness of relationship. There is no valid 
criterion to help us decide which of these 
values should be considered the correct one. 
Therefore closeness of relationship as de- 
fined by the r-eta technique is essential- 
ly ambiguous; and the r-eta definition of 
"closeness of relationship" is a poor one. 
In actual practice most statisticians 
cut the Gordian knot by treating all re- 
gressions as though they were linear. This 
practice is, of course, condemned by all
writers on statistics. It has, however, 
the advantage of providing an unambiguous 
definition for closeness of relationship. 





1. Furfey, Paul Hanly and Daly, Joseph F.: "Product-moment Correlation as a Research Technique" Forthcoming in the

But the disadvantages involved more than
counterbalance this advantage. For the
indiscriminate use of r can even lead to the
finding of zero relationship in distributions
where the relationship is expressible
by a simple mathematical function and is
therefore perfect in the sense of the physicist.
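A small Python sketch makes the point concrete. It assumes, purely for illustration, seven points lying exactly on the parabola y = x²; the function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_ratio(xs, ys):
    """Eta of y on x, taking one column for each distinct value of x."""
    columns = {}
    for x, y in zip(xs, ys):
        columns.setdefault(x, []).append(y)
    grand_mean = sum(ys) / len(ys)
    total = sum((y - grand_mean) ** 2 for y in ys)
    between = sum(
        len(col) * (sum(col) / len(col) - grand_mean) ** 2
        for col in columns.values()
    )
    return math.sqrt(between / total)


# Points lying exactly on the curve y = x^2 (an assumed example).
XS = [-3, -2, -1, 0, 1, 2, 3]
YS = [x * x for x in XS]

if __name__ == "__main__":
    print("r   =", pearson_r(XS, YS))          # zero: no linear relationship
    print("eta =", correlation_ratio(XS, YS))  # unity: perfect curvilinear relationship
```

Here every point falls on a simple curve, yet r is zero while eta is unity.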

More careful statisticians adopt some 
compromise. For example, they may decide 
to use some such test of linearity as the 
zeta test and to choose eta whenever the 
ratio between zeta and its σ exceeds a certain
value, and r otherwise. Of course this
does not remove the ambiguity as far as eta
is concerned, but it provides through r an 
unambiguous definition for closeness of re- 
lationship in a considerable number of 
cases. 

The procedure mentioned in the last 
paragraph, however, is subject to certain 
considerable disadvantages. The definite- 
ness of definition is obtained by arbitra- 
rily lumping together various kinds of bi- 
variate distributions which might profit- 
ably have been defined as representing dif- 
ferent degrees of closeness of relationship. 
Just so, the entomologist might obtain un- 
ambiguity of definition by lumping all but- 
terflies together as one species. Again, 
this wide use of r where the regression is 
not strictly linear destroys such useful 
interpretations of closeness of relationship
as that which interprets closeness in
terms of the scatter around the regression
line. In other words, although this usage 
cannot be criticized as ambiguous in the 
sense of yielding different values for the 
same distribution, it can be called ambigu- 
ous in that it yields the same value for 
distributions which might usefully be con- 
sidered as representing different degrees 
of relationship. 

It will be seen from these considera- 
tions that the r-eta technique is subject 
to disadvantages of a serious character. 
Each bivariate distribution can be made to 
yield one r and several etas. In general 
these various numerical measures of close- 
ness of relationship will not be equal. We 
are therefore faced with the necessity of 
choosing between them. If we make our 
choice on the principle that the regression 
chosen should conform to the existing bi- 
variate distribution in some reasonable way, 
then we must make our choice between r and 
the various etas on a non-mathematical basis 
and the measure of closeness of relation- 
ship ceases to be truly quantitative. If, 
on the other hand, we arbitrarily choose to 
use r, even when the distribution is not 
surely linear, then we secure a definite 
numerical value for our coefficient, but 
sacrifice a large part of its meaning as a 
measure of relationship.

TABLES FOR FINDING THE PARTIAL COEFFICIENT OF CORRELATION
by 
William Dowell Baten 
University of Michigan 


The object of this article is to present
tables for finding the partial coefficient
of correlation,

\[ r_{12.3} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}} \qquad (1) \]

between x_1 and x_2 with x_3 held constant.

The quantities r_{13} and r_{23} may be interchanged
in the formula without affecting
the value of the partial coefficient.
For a mathematical treatment of partial correlation
see Camp¹ and Rietz.²

The following tables contain values of
r_{12.3} for values of the total correlation
coefficients for every one-tenth, that is, r_{12}
has values 0, .1, .2, ..., .9, 1.0 while r_{13} and
r_{23} have values equal to .0, .1, .2, ..., .9
respectively. The following example will
illustrate the meaning of the formula and
also show how to use the tables.

The correlation coefficient between the
weights and chest measurements of men who
are twenty years of age is r_{12} = .8; the
correlation between weights and heights is
r_{13} = .5, while the correlation coefficient
between chest measurements and heights is
r_{23} = .6. The partial coefficient of correlation
between weights and chest measurements
with height held constant is, according
to Table I,

\[ r_{12.3} = .72 \]

To find this, find .5 in the column
headed r_{13} and then locate in the column
headed r_{23} the number .6 lying between .5 and .6
for r_{13}; now go down the column r_{12} = .8.
This is the eighth column in Table I. Go down
this column to the row r_{13}, r_{23} = .5, .6.
This gives r_{12.3} = .72.

Since r_{13} and r_{23} can be interchanged
in formula (1), this same value for r_{12.3} is
obtained when r_{13} = .6, and r_{23} = .5. The
value for r_{12.3} is found in the same place
as before.
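The worked example can also be verified directly from formula (1). The Python function below is a minimal sketch; the name `partial_r` is chosen here for illustration.

```python
import math


def partial_r(r12, r13, r23):
    """First-order partial correlation r_12.3 computed from formula (1)."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


if __name__ == "__main__":
    # Weights (1), chest measurements (2), heights (3).
    print(round(partial_r(0.8, 0.5, 0.6), 2))  # 0.72
    # Interchanging r_13 and r_23 leaves the result unchanged.
    print(round(partial_r(0.8, 0.6, 0.5), 2))  # 0.72
```

The direct computation agrees with the tabled value of .72 to two decimal places.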

Suppose we know that r_{12} = .4, r_{13} = .2
and r_{23} = .7, and wish to find r_{12.3}. Go to
column for r_{12} = .4 and go down this column
to row r_{13}, r_{23} = .2, .7. This gives .42
for r_{12.3}.

If r_{13} and r_{23} have the same sign the
partial coefficient of correlation can be
also found from Table I, provided r_{12} is
positive. Suppose r_{12} = .5, r_{13} = -.3 and
r_{23} = -.9. Pay no heed to the signs of r_{13}
and r_{23}. Go down the column for r_{12} = .5 to
row r_{13}, r_{23} = .3, .9; this gives .52 for
r_{12.3}.

Suppose r_{12} = .7, r_{13} = .8 and r_{23} = .2.
Interchange the values of r_{13} and r_{23} and
do as before. This gives for the partial
coefficient the value .92.

Table I can be employed when r_{13} and
r_{23} are unlike in sign provided r_{12} is
negative. In this case the sign of r_{12.3} is
opposite to that found in the table. Suppose
r_{12} = -.8, r_{13} = +.2, and r_{23} = -.3. Pay
no heed to the signs. Go down the column
for r_{12} = +.8 to the row r_{13}, r_{23} = .2, .3;
this gives +.79 for r_{12.3}. This value must
be changed to -.79.

Consider the case when r_{12} = -.3, r_{13}
= -.6, and r_{23} = .9. Go down the column for
r_{12} = +.3 to the row for r_{13}, r_{23} = +.6, +.9;
this gives -.69 for r_{12.3}. By changing the
sign the real value of r_{12.3} is +.69.

When r_{13} and r_{23} are unlike in sign
and r_{12} is positive then Table II must be
used for finding values for r_{12.3}. Assume
that r_{12} = .3, r_{13} = .4 and r_{23} = -.6. In
Table II go down the column for r_{12} = .3 to
the row for r_{13}, r_{23} = +.4, +.6; this gives
.74 for r_{12.3}. This is as it should be for
when r_{13} and r_{23} are unlike in sign and r_{12}
is positive the numerator in (1) is positive,
while the denominator is always positive.

1. Camp, B. H. "The Mathematical Part of Elementary Statistics", pp. 341-542.
2. Rietz, H. L. "Mathematical Statistics", pp. 98-101.

If r_{13} and r_{23} have the same sign while
r_{12} is negative the value of r_{12.3} is negative
and a negative sign must be placed before
the value found in Table II. For example,
r_{12} = -.6, r_{13} = .3 and r_{23} = .5. Go down
the column for r_{12} = +.6 to the row for r_{13},
r_{23} = .3, .5; this gives +.91. But since we
know the value for r_{12.3} is negative a
minus sign must be placed before this value.
Thus the correct value is r_{12.3} = -.91.
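The sign rules for choosing between the two tables can be summarized in code. The Python sketch below is an assumed restatement of those rules, not a reproduction of the printed tables: it computes the entry that a table would hold for the absolute values of the coefficients and then attaches the sign as the text directs. The names `table_entry` and `partial_r` are hypothetical.

```python
import math


def partial_r(r12, r13, r23):
    """First-order partial correlation r_12.3 computed from formula (1)."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


def table_entry(r12, r13, r23):
    """Return (table, tabled value, signed r_12.3) following the sign rules."""
    a, b, c = abs(r12), abs(r13), abs(r23)
    denom = math.sqrt((1 - b * b) * (1 - c * c))
    alike = r13 * r23 >= 0
    # Table I: r_12 positive with like signs, or r_12 negative with unlike signs.
    if (r12 >= 0) == alike:
        table, value = "I", (a - b * c) / denom
    else:
        table, value = "II", (a + b * c) / denom
    # When r_12 is negative the sign of the tabled value is reversed.
    signed = value if r12 >= 0 else -value
    return table, value, signed


if __name__ == "__main__":
    for case in [(-0.8, 0.2, -0.3), (0.3, 0.4, -0.6), (-0.6, 0.3, 0.5)]:
        table, value, signed = table_entry(*case)
        print(case, "Table", table, round(value, 2), round(signed, 2))
```

For every combination of signs, the signed result agrees with a direct evaluation of formula (1).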

These tables can be used to find the
partial correlation coefficient,

\[ r_{12.34} = \frac{r_{12.3} - r_{14.3} r_{24.3}}{\sqrt{(1 - r_{14.3}^2)(1 - r_{24.3}^2)}} \]

between x_1 and x_2 when x_3 and x_4 are held
constant.

If r_{12}, r_{13}, r_{14}, r_{23}, r_{24}, and r_{34}
are known, then r_{12.34} can be found from
Tables I and II. First it is necessary to find
r_{12.3}, r_{14.3}, and r_{24.3}. After these
coefficients of the second order have been
found they can be used as the total correlation
coefficients were used before.

Consider the example: 

NT 
T, 
Tha 
Qs 
Ths 


Tyeight height = *>+ 
Tweight rt. thigh = 8. 
Tt. thigh height = 7*2: 
Tyeight chest meas.” 7. 
Trt. thigh chest meas.~ °*: 


Tohest meas. height =.6. 
Find the value of

\[ r_{12.34} = r_{\text{weight, rt. thigh} \,\cdot\, \text{chest meas., height}} \]

From the tables r_{12.3} = .79 = .8, r_{14.3} = .14
= .1, r_{24.3} = +.05+ = +.1. Now use r_{12.3} as
r_{12}, r_{14.3} as r_{13}, and r_{24.3} as r_{23} and look
in the tables. This gives .9 for r_{12.34}.

It must be remembered that when the 
total coefficients of correlation are ex- 
act numbers the tables give results correct 
to two significant figures but when the to- 
tal correlation coefficients are exact for 
one significant figure the tables are cor- 
rect for only one figure. Other tables are 
being prepared whereby results can be read 
to two significant figures, in the case un- 
der consideration to two decimal places. 
This longer table will of course give more 
accurate results yet the tables presented 
here can be used for rough work and will 
give a very good idea concerning the size 
of the various correlation coefficients. 

Partial coefficients of correlation of
higher order may also be obtained from
these tables. For example, the partial
coefficient,

\[ r_{12.34 \ldots n} = \frac{r_{12.34 \ldots (n-1)} - r_{1n.34 \ldots (n-1)} \, r_{2n.34 \ldots (n-1)}}{\sqrt{[1 - r_{1n.34 \ldots (n-1)}^2][1 - r_{2n.34 \ldots (n-1)}^2]}} \]

may be obtained from these tables by
building up the various partial coefficients
of correlation of lower orders.
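The building-up process lends itself to a recursive computation. The following Python sketch assumes a hypothetical set of total correlations among four variables; the function names and the dictionary layout are illustrative choices and are not part of the original tables.

```python
import math


def partial_corr(r, i, j, controls=()):
    """Partial correlation of i and j with the `controls` held constant.

    `r` maps each variable label to a dict of its total correlations.
    The last control is removed by the recursion formula, using partial
    coefficients of the next lower order.
    """
    if not controls:
        return r[i][j]
    *rest, k = controls
    r_ij = partial_corr(r, i, j, rest)
    r_ik = partial_corr(r, i, k, rest)
    r_jk = partial_corr(r, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def symmetric(pairs):
    """Build a symmetric lookup from a mapping {(i, j): r_ij}."""
    r = {}
    for (i, j), value in pairs.items():
        r.setdefault(i, {})[j] = value
        r.setdefault(j, {})[i] = value
    return r


# Hypothetical total correlations among four variables (illustration only).
R = symmetric({(1, 2): 0.72, (1, 3): 0.63, (1, 4): 0.54,
               (2, 3): 0.56, (2, 4): 0.48, (3, 4): 0.42})

if __name__ == "__main__":
    print("r_12.3  =", round(partial_corr(R, 1, 2, (3,)), 3))
    print("r_12.34 =", round(partial_corr(R, 1, 2, (3, 4)), 3))
```

The order in which the controlled variables are listed does not affect the result, so r_{12.34} and r_{12.43} come out equal.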
TABLE I

TABLE FOR FINDING THE PARTIAL COEFFICIENT OF CORRELATION WHEN
(a) r_{12} IS POSITIVE AND r_{13} AND r_{23} ARE ALIKE IN SIGN, (b) r_{12} IS
NEGATIVE AND r_{13} AND r_{23} ARE UNLIKE IN SIGN

The value of r_{12.3} when r_{12} is equal to





al 2 3 4 5 .6 s? 8 -9 1.00 rep r,, 
.10 2 -30 -40 -50 -60 .70 -80 -90 1.00 .0 
.10 2 .30 -40 .50 -60 .70 .80 -91 eS 
10 -20 31 41 -51 -61 oT -82 -92 2 
.10 -21 31 .42 .52 .63 .72 .84 .94 3 
11 22 .33 .44 .55- .65+ .76 .87 .98 aa 
2 23 .35- .46 -58 .69 .81 .92 5 
3 2 -38 -50 -63 75 .88 1.00 6 
.14 .28 42 .56 -70 .84 .98 7 
17 .33 -50 .67 83 8 
22 .46 .69 -92 9 
.09 .19 -29 .39 .49 .60 -70 .80 -90 1.00 “21 
.08 .18 .29 .39 .49 .60 7 -80 -90 £ 
.07 .18 -28 .39 .49 .60 a 81 -92 3 
.07 .18 .28 .39 -50 61 72 .83 .94 4 
.06 ol? 2 .41 -52 .64 . 75+ .87 -99 5 
05+ .18 .30 .43 .55+ .68 .80 .93 6 
.04 .18 .32 .46 .61 s75— .80 4 
-03 2 37 .54 7 87 8 
.02 .25+ .48 TL .95- 9 
.06 17 o27 .38 .48 .58 .69 A .90 1.00 .2 
.04 15 -26 -26 47 .58 .68 -79 -90 2 
.02 vie -25- .36 .47 .58 .69 .80 -90 4 
.00 12 24 . 35+ .47 .59 71 82 .94 5 
-.03 a i 26 .40 .54 .69 83 .97 6 
-.07 .10 o27 .42 61 .78 . 95+ 7 
=<0 .07 -24 -41 5 -75- .92 8 
=-.19 .05- 2 .52 . = .98 9 

s02 12 -23 34 45 .56 .67 7 .89 i008 “8 2 
—. 0% .09 .21 32 42 -55- .66 .78 .89 4 
~.06 .06 -28 -30 .42 .54 -67 -79 -91 5 
-.10 .03 16 29 .42 -55 68 81 94 6 
-.16 -O1 12 .28 243 .57 «72 .87 o? 
—.2 .07 -10 -28 .45+ .63 -80 .98 8 
-.41 617 .07 31 52 .79 9 

07 .05- Pe ly -29 .40 .52 -64 -76 .88 1.00 .4 4 
=.18 .00 oid .25+ .38 -50 .63 .76 .88 5 
~.19 .O5+ .08 .22 . 35+ .49 .63 .76 -90 .6 
-.27 .12 -03 .18 .34 .49 .64 .79 -95- aT 
-.40 .22 —.04 «18~ 33 -51 .69 .87 8 
~.65+ .40 -.15 -10 36 -60 - 85+ 9 

-.20 .07 .07 -20 .33 .48 .60 73 .87 i CCS 
-.29 14 .00 14 -29 43 .58 «98 .87 6 
-.40 24 -.08 .08 24 -40 .57 a0 .89 7 
-.58 .39 ~s19 .00 .19 .39 .58 oT .96 8 
~.93 -66 -.40 ~i8 .13 -40 .66 -93 9 

-.41 .25 ~.09 .06 ~22 .38 .53 -69 .84 1.00 .6 6 
-.56 .39 =.2) .04 .14 .32 .49 -67 .84 a? 
-.79 .58 -.38 a -.04 25+ .46 .67 .86 -8 
.98 -.69 -.40 «ii o5t .46 ~15— -9 

-.76 .57 ~.37 ~.18 -02 -22 -41 61 .80 1.00 7 7 
.84 -.61 ~.57 -.14 .09 .33 .56 .79 -8 
=~. 74 -.42 ~o80 ~22 55-  .67 9 

-~.94 ~.67 -.39 me Pe i .44 72 1.00 .8 £ 
-—.84 -.46 -.08 31 .69 9 

-.58 .05+ .47 1.00 .9 a 

wad ai 3 -4 5 6 of 8 9 1.00 ,Ts3 Tas 
TABLE II

TABLE FOR FINDING THE PARTIAL COEFFICIENT OF CORRELATION WHEN
(a) r_{12} IS POSITIVE AND r_{13} AND r_{23} ARE UNLIKE IN SIGN; (b) r_{12} IS
NEGATIVE AND r_{13} AND r_{23} ARE ALIKE IN SIGN

The value of r_{12.3} when r_{12} is equal to





-4 -6 7 -8 


“I 
~» 


«40 -80 
+40 -80 
-82 
- 84 


” 
‘ 


© 
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3 
4 
5 
6 
7 
8 
.9 


1D OAHM Soa 


SA DAMN OODAIOWUS 


3 «4 5 6 “7 -8 -9 
The value of ryg,5 when ry, is equal to 


“4 
Ly 
we 





A minus placed after a five, for example .65-, means that this .65 was not quite .65. A plus after a five means
that the rounded off number was greater than five. A five with a dot above it means that it is exactly five.
AN EMPIRICAL TEST OF SAMPLING
by
Donovan A. Johnson
Stillwater High School, Stillwater, Minn.
and
Alvin C. Eurich
University of Minnesota


The theory of sampling has been applied
widely to psychological and educational
data. Probable and standard errors have
been used in these fields until the highly 
trained mathematician and the novice alike 
regard the results with a great deal of 
skepticism. And well they might, for too 
often formulae have been applied when none 
of the assumptions underlying them are ful- 
filled or when it is not known that any of 
the assumptions are satisfied. Much can be 
gained, it seems, through empirical tests 
of the theory of sampling with a wide vari- 
ety of data. This is particularly true 
where it is desirable to observe annual 
trends and the summarization of data for 
the entire population is too laborious and 
expensive. The present study deals with a 
situation of this type. 


THE PROBLEM 


Each year the Minnesota State Depart- 
ment of Education collects considerable in- 
formation concerning all teachers and school 
administrators in the public schools 
throughout the state. Not only the State 
Department but educational agencies as well 
are concerned about the trends in the qual- 
ifications of this group. It is a very 
practical question, therefore, to ask how 
large a proportion of the total group is 
necessary in order to obtain reliable re- 
sults for the data collected annually. To 
answer this question all the data collected 
in September 1931 from 3,437 high school 
teachers, principals and superintendents 
were analyzed by ten percent samples and 
by various combinations of these samples. 





The reports required of all school 
systems in Minnesota contained information 
in regard to the class of school, the number
of periods in a school day, the length
of each period in minutes, and the names of 
the superintendent, principal, and teachers. 
For each person on the staff, the following 
information was supplied: the kind of cer- 
tificate held, date of expiration of the 
certificate, the major and minor fields as 
given on the certificate, name of school 
from which he was graduated, date of grad- 
uation, course taken, years of experience, 
subjects taught listed by periods and by 
grades, and annual salary. In general the 
types of statistical constants checked for 
reliability were percentages, medians, and 
quartile deviations. This paper must be
limited to a few representative samples of 
the wide variety of analyses that were 
made. However, the total picture was ex- 
ceedingly consistent in showing that reli- 
able results can be obtained for practical- 
ly all data when less than a third of the 
reports are used. 


METHODS OF ANALYSIS 


To facilitate the analysis, the re- 
sponses on each report were converted into 
a code and punched on a Hollerith card. 
These were readily sorted and tabulated 
mechanically. 

The samples used throughout the study 
were selected in the following manner. 
First, the cards were alphabetized by towns 
or names of the schools and by names of the 
teachers, principals, and superintendents 
within the schools. These cards were then 
divided into ten samples of approximately
10 percent each. The first lot was selected
by taking every tenth card from the
files after they had been arranged as explained
above. This lot then contained the
tenth card, twentieth card, thirtieth card 
and so on through the entire files and was 
designated sample number zero, which was 
punched on each card. Sample number one 
was selected by taking the first, eleventh, 
twenty-first card and so on through the 
files. Likewise, sample number two con- 
tained the second, twelfth, twenty-second 
card and so on. This method was followed 
throughout, dividing the group into ten 
samples selected at random according to one 
of the best mechanical methods. 
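
The card-dealing procedure described above can be sketched in code. This is a minimal illustration rather than the authors' mechanical procedure: the function name `split_into_ten_samples` and the use of integers as stand-ins for the sorted Hollerith cards are assumptions made for the example.

```python
def split_into_ten_samples(cards):
    """Assign each card, in file order, to one of ten samples by its position.

    Positions 10, 20, 30, ... go to sample 0; positions 1, 11, 21, ... go to
    sample 1; and so on, following the scheme described in the text.
    """
    samples = {k: [] for k in range(10)}
    for position, card in enumerate(cards, start=1):
        samples[position % 10].append(card)
    return samples


# Hypothetical stand-in for the 3,437 alphabetized cards: card i is the integer i.
cards = list(range(1, 3438))
samples = split_into_ten_samples(cards)
sizes = [len(samples[k]) for k in range(10)]
print(sizes)
```

Because the cards are dealt in rotation, the ten samples can differ in size by at most one card.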

The percentages, medians, and quartile 
deviations were obtained for the various 
samples as well as for the entire data. The 
sample percentages, medians, or quartile 
deviations were considered reliable if they 
did not deviate by more than ± 4 P.E. from
the corresponding values for the total pop- 
ulation. The probable errors were based 
upon the entire group because the sample 
probable error proved to be too large for 
practical application. There can be no 
question concerning the interpretation of 
the results when this method is used be- 
cause the probable error based upon the en- 
tire data provides a more rigid test of re- 
liability than would be obtained through 
the use of probable errors based upon the 
frequency within each sample. 

The assumptions underlying this check 
of the reliability of samples are those 
generally made when probable errors are ob- 
tained. To understand fully the nature of 
this analysis, specific attention must be 
directed to the following two assumptions: 

(1) If the same percentage or other 
statistical constant is determined for an 
infinite number of random samples that are 
relatively large and of equal size, the 
values will distribute themselves in a nor- 
mal distribution with the mean equal to the 
true value obtained from the entire group. 

(2) The entire group used in this 
study is merely a sample of a still larger 
group in which the larger sample values al- 
so distribute themselves normally. This as- 
sumption is reasonable since all Minnesota 
teachers, principals and superintendents
might be considered a sample of the teach- 
ers, principals and superintendents in the 
Northwest. It was necessary to make this 
assumption in order to obtain the probable 
errors of the constants derived from the 
entire Minnesota group. Regarded as true 
percentages or medians the probable errors 
would have no meaning. 

The validity of the first assumption 
was tested by distributions of frequencies 
derived from ten percent samples and by 
averages of the values obtained from ten 
percent samples. Since in terms of the 
original samples it was possible to secure 
only ten values, a situation had to be hypothesized
with a larger number of samples
of equal size. This was done by adding the
frequencies of two ten percent samples and
dividing the sum by two. In this manner
all possible combinations of the ten samples
taken two at a time were added together.
To the results from the 45 possible
combinations were added those of the
original ten samples, thus making a total 
of 55 samples. While the result derived 
from the 55 samples is probably not a true 
picture of an infinite number of samples, 
it tends to approach that situation. 
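
The construction of the 55 values can be sketched as follows. The ten counts below are invented for illustration and are not the study's data; `pooled_values` is a hypothetical helper name.

```python
from itertools import combinations
from statistics import mean


def pooled_values(sample_values):
    """Return the original values plus the average of every pair of them."""
    singles = list(sample_values)
    pair_averages = [(a + b) / 2 for a, b in combinations(sample_values, 2)]
    return singles + pair_averages


# Hypothetical counts for ten 10-percent samples (not taken from Table I).
counts = [21, 23, 24, 26, 27, 28, 29, 31, 33, 38]
values = pooled_values(counts)

print(len(values))                  # 10 originals + 45 pair averages
print(mean(counts), mean(values))   # the two means coincide
```

Each sample enters nine of the 45 pairs, so the mean of the 55 values equals the mean of the original ten; the pairing narrows the spread of the distribution without shifting its center.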


RESULTS 


The data on the number of graduates 
of the University of Minnesota in Class B 
four-year high schools will serve as a 
typical illustration of the distributions 
obtained. In Table I the number of Minne- 
sota graduates for each ten percent sample 
is given. Thus, in one ten percent sample
there were 16 graduates of the University 
of Minnesota; in another there were 20; in 
three, 26; etc. On the bottom line appears 
the distribution of the results for the 
hypothetical situation of 55 ten percent 
samples obtained by the method described 
above. The nature of these distributions 
is clearer in Fig. 1. The points located 
by the X's represent the data for the original
ten percent samples and may therefore
be read as follows: for one sample
the number of graduates from the University 
of Minnesota is 16; for another sample, 20; 










176 JOURNAL OF EXPERIMENTAL EDUCATION 





Volume III, No. 3 


and for three others, 26; etc. The continuous
line represents the distributions for
the hypothesized situation and the broken
line represents the smoothed frequency surface.
The number of Minnesota graduates in
the total population as estimated from the
mean of the frequency distribution for the
55 samples is 281; the actual number of
graduates in the total group is 273. Since
this degree of similarity appears repeatedly
throughout the results, the first assumption,
that the mean of the sample constants
is equal to the true value, is practically
realized.

To illustrate further the analysis of 
the reliability of percentages for samples 
varying in size, the data on the number of 
periods in the school day have been selected 
as representative. In regard to this item, 
Table II contains the following data: 


TABLE I

DISTRIBUTION OF THE NUMBER OF GRADUATES OF THE UNIVERSITY
OF MINNESOTA IN CLASS B, FOUR-YEAR HIGH SCHOOLS, FOR EACH
TEN PERCENT SAMPLE AND FOR THE AVERAGES OF THE TEN PERCENT
SAMPLES

                             Number of individuals
                     16  18  20  22  24  26  28  30  32  34  36  Total
10% samples           1       1           3  [remaining entries illegible]  10
Averages based on
combinations of
samples               1   1   2   5   5  11  11  10   6   2   1    55



TABLE II

THE DEVIATIONS OF SAMPLE PERCENTAGES FROM THE PERCENTAGES
OF THE TOTAL GROUP OF TEACHERS, PRINCIPALS, AND SUPERINTENDENTS
IN CLASS B, FOUR-YEAR HIGH SCHOOLS HAVING
VARIOUS NUMBERS OF CLASS PERIODS IN THE SCHOOL DAY

                                     Deviations of Sample Percentages
Number of   Percentage of
Periods     Total Group      PE      10%   20%   30%   40%   50%

Fig. 1. The number of graduates of the University of
Minnesota in Class B, four-year high schools
as estimated by ten percent samples of the
total group of teachers, principals, and
superintendents. (Axis label: Number of Graduates.)


1. The percentage of the entire group
within Class B, four-year high schools for
each number of periods in the school day.
(True percentage.)

2. The probable error of the percentage.

3. The deviations of the sample percentages
from the true percentage.

4. The minimum size of reliable samples
for each percentage, as indicated by
the star.

5. The total number of individuals in
each category.

The second column of the table indicates
that 8 per cent of the total group of
teachers, principals and superintendents
were employed in school systems with nine
class periods in the school day. The percentages
for both the 10 and 20 percent
samples deviate from that for the total by
-2. For the larger samples the deviation
is -1. The remaining portion of the table
likewise reveals that for practical purposes
a 30 percent sample is large enough
to yield reliable results for the particular
type of information included.





Fig. 2 represents graphically certain 
data concerning the proportion of teachers, 
principals and superintendents connected 
with schools having different numbers of 





Periods   Percentage    PE     10%   20%   30%   40%   50%
9              8        .6     -2*   -2    -2    -1    -1
8             61       1.0     -3*   -1    -1    -1    -1
7             14        .7      3*    2     0     0     0
6             17        .8      0*    0     1     1     1
Total      1,090               105   212   327   459   540


periods in a school day. The shaded columns
represent the proportions derived from
10 percent samples; the columns in outline,
the 20 percent samples; and the black columns,
the entire group. While again this
graph is merely representative, it portrays
the fact that for the data analyzed, the results
obtained from the 10 and 20 percent
samples are strikingly the same as for the
entire group.

A summary of reliable sample percent- 
ages for data on length and number of peri- 
ods related to class of school appears in 
Table III. In terms of the criterion set 
up it may be seen that 42 of the 50 per- 
centages are reliable when based on a 10 
percent sample, five additional require a 
20 percent sample, and only three of the 50 
require a sample as large as 30 or 40 per- 
cent to be reliable. 

Fig. 2. The proportion of teachers, principals, and
superintendents in schools with different
numbers of class periods in the school day
as determined by a ten percent sample, a twenty
percent sample, and the entire data.


TABLE III

THE DISTRIBUTION OF RELIABLE SAMPLE PERCENTAGES
FOR DATA ON LENGTH AND NUMBER OF
PERIODS RELATED TO CLASS OF SCHOOL

Class of School     10%   20%   30%   40%

[Table entries are not recoverable from the scan.]

In testing the reliability of sample
percentages, 403 different items were analyzed.
The total number of individuals in
the separate groups varies from 0 to 1,090.
The percentages range from 0 to 98. These
figures give some indication of the comprehensive
treatment of the data by samples.
Throughout, the results are consistent in
showing that percentages based on a sample
of 30 percent are reliable. Only 17 of the
403 percentages based upon a 30 percent
sample deviate more than four probable errors
from the corresponding percentages of
the entire group. If the results from any
30 percent sample had been used and interpreted
in terms of the probable errors based
upon these samples, they would not differ
significantly from the results derived from
the total group.

A check of the reliability of salary
medians was made by grouping teachers,
principals and superintendents according to
(1) place of graduation, (2) kind of degree,
(3) college course, (4) class of school,
(5) kind of certificate, and (6) position
and experience. The deviations of sample
medians from the true median annual salary
for graduates of various groups of colleges
are given in Table IV. In Fig. 3, similar
data are shown for educators in various
classes of schools. On this graph, the
horizontal axis represents the size of sample,
and the vertical axis represents the
deviation of the sample medians from the
true median annual salary in dollars for
each class of school. The horizontal zero
line represents the true median. The limits
of reliability (± 4 P.E.) are indicated on
both sides of the diagram, the curve and
its reliability limit having the same legend.


A study of the graph shows in general
that the curves for different classes of
schools approach the true medians as the
size of the sample is increased. With a
20 percent sample for class 0 schools, the
curve barely comes within the limits of reliability.
With a 30 percent sample, the
deviation from the true median is less than
one probable error. With samples of 40 percent
or larger the variation from the true
median is very slight. Only for class 3
schools are the samples of 30 percent or
more unreliable. Since only 105 individuals
are included in schools of this class, it is
not surprising that the results derived
from samples are not more reliable.

The median annual salaries for teachers,
principals and superintendents grouped
according to class of school are shown in
Fig. 4 for a 20 percent sample, a 30 percent
sample, and for the entire data. Clearly,
a 30 percent sample yields a picture not
unlike that for the total group. For all
classes of schools except 3, the median
based upon 30 percent deviates less than
one probable error from the median based
upon the entire data.

In Table V may be found a summary of
the minimum size of sample yielding reliable
median salaries for the various categories.
A total of thirty-six distributions
was analyzed. The frequencies within these
groups range from 49 to 1,858. The median
annual salaries range from $1,150 to
$2,040. Only one of the 36 distributions
required a sample larger than thirty percent.
For 13 distributions a 10 percent
sample was sufficient.

TABLE IV

THE DEVIATIONS OF SAMPLE MEDIANS FROM THE TRUE MEDIAN
ANNUAL SALARY FOR GRADUATES OF VARIOUS COLLEGES

                       Total Group               Deviations of Sample Medians
Place of                                         10%     20%     30%     40%     50%
Graduation           Number  Median  PE Md     Sample  Sample  Sample  Sample  Sample
[name illegible]      1,498   1,582    7.6       -13      5      11       6       2
Outside of Minn.        920   1,680   16.6       160     65      43*     15       5
U. of Minn.             691   1,420   11.6       -26*     7      -5       0      -2
No report               128
Total                 3,437

* The asterisk shows the size of sample that gives a reliable result in that category.

Fig. 3. The deviations of median annual salaries of teachers, principals, and superintendents in each class of school
as determined by different samples from the median annual salary for the entire group within each class of
school.
Fig. 4. The median annual salaries of teachers, principals,
and superintendents in different classes
of schools as derived from a twenty percent
sample, a thirty percent sample, and the entire
group. (Legend: 20% sample; 30% sample; entire data.
Horizontal axis: class of school.)


TABLE V

A FREQUENCY DISTRIBUTION OF THE SIZE OF RELIABLE
SAMPLE IN THE DETERMINATION OF MEDIAN ANNUAL
SALARIES FOR VARIOUS CLASSIFICATIONS

                            Frequency of Reliable Medians
                            10%      20%      30%      40%
Classification             Sample   Sample   Sample   Sample

Class of School
Course in College
Kind of Degree
Place of Graduation
Position and Experience
Kind of Certificate
Total

[Cell frequencies are not recoverable from the scan.]


The reliability of the semi-interquartile
ranges was also determined for 27
groups. Again a 30 percent sample proved
to be adequate. The variations from the
true value, however, were considerably
greater than for the percentages or medians.


SUMMARY 


This study is a practical test of the
theory of sampling. Specifically, information
concerning the qualifications of high
school teachers, principals and superintendents
in Minnesota was collected in regard
to all such persons employed in the
state--a total of 3,437. The total population
was divided into ten random samples.
Data were analyzed for each sample and for
various combinations of samples. While it
is difficult in this brief space to give a
complete picture of the analysis, it may be
said in conclusion that the results support
the following generalizations:

1. A 30 percent sample of high school
teachers, principals and superintendents
within the State of Minnesota is sufficiently
large to represent the entire group in
dealing with data concerning teachers'
qualifications. The use of larger samples
does not increase the reliability sufficiently
to warrant the time and effort required.

2. The percentages, medians and quartile
deviations based upon 30 percent of the
group deviate more than four probable errors
from the result for the total group slightly
more than five times out of 100.

3. The method of sampling used in this
study may be used in annual investigations
of teachers' qualifications in the State of
Minnesota.

While these generalizations are sound 
for the data that have been analyzed, no 
implication is warranted extending the ap- 
plication of these results to other types 
of data. It is not possible to infer that 
a 30 percent sample of any group is sufficient
to yield reliable results for other
types of data. Each situation must be 
tested separately. 
THE INTERRELATION OF D, V, T, AND P SCORES
by 
J. DeWitt Davis 
Director of the School of Education 
Texas College of Arts and Industries 
Kingsville, Texas 


More and more the use of objective 
measurement is being relied upon in all 
phases of the educational process. The set- 
ting up of objectives for teaching, the 
study of individual differences in order to 
know how each student can be best guided in 
the school program, the measurement of out- 
comes in terms of pupil changes, all of 
these procedures require statistical anal- 
ysis. One of the phases of this analysis 
involves the comparisons of scores. That 
is to say, scores made on one test require 
comparison with those made on other tests 
when the two series of scores in their 
first state are not directly comparable. Ad- 
vance textbooks to date have touched upon 
some of the factors here considered. The 
purpose of this paper is to bring some of 
these procedures together into one brief 
treatise for more ready reference. 

To do this effectively requires first 
a statement of meaning involved in the word 
comparable. For the specific consideration 
here presented this implies a common cen- 
tral tendency and a common deviation unit. 
In most cases the population being examined 
and compared one with the other can be rep- 
resented as a normal distribution, and for 
that reason the functions of the normal 
curve are involved in the discussion. With- 
out sufficient numbers and adequate 
sampling to justify this assumption, exten- 
sive statistical study of scores can hardly 
be justified. To facilitate understanding, 
the following brief statements are given 
concerning the meaning attached to the va- 
rious scores which are deemed comparable. 

D-scores are variously called standard
deviation scores, Z-scores, x/σ, and d/σ
scores. D-scores are all directly comparable
because each D-score distribution has a
mean of zero and a standard deviation of
one.

V-scores are comparable, having been 
converted from a given series with its own
central tendency and deviation measure to 
values in another series with a different 
central tendency and a different deviation 
value. 

T-scores are comparable, having been 
reduced to a common series that has a mean 
of 50 and a standard deviation of 10; that 
is, they are transformed to a given stand- 
ard in central tendency and dispersion. 

P-scores, or percentiles, represent 
points on a given score scale below which 
a certain percentage of the total population
measured by the scale lies; e.g., P-score
31 means that 31 percent of the population
lies below, or 69 percent above, the
raw score involved. Because they locate 
relative positions within a given group 
they too are comparable scores. 

These particular symbols are proposed 
merely to facilitate their recall. They 
all sound alike--D, V, T, P--and, in fact,
have much more than sound in common. Each
possesses, moreover, peculiar advantage for
certain analyses. Each letter used as the
score name may be thought of as the initial
letter of the word that characterizes
its unique meaning. Thus D stands for "deviate"
on the base line of the normal curve
involved. V stands for "vertere", Latin
meaning to turn, to change from one to another.
T stands for "transform", to make
over from a given raw series to a standard
one having a mean of 50 and a sigma of 10.
P stands for "percentile", or point on a 
given scale below and above which certain 
portions of the total population involved 
are distributed. 
Whenever raw scores are reduced to a
comparable basis certain assumptions are 
made. (1) The traits being measured are 
supposedly normally or comparably distrib- 
uted; (2) the measures employed are compar- 
ably reliable; (3) the cases involved are 
adequate in number and selection to repre- 
sent a random sampling. Given approximately 
these conditions one is probably justified 
in reducing his scores to one of the pro- 
posed comparable bases. 

D-scores: In a normal distribution,
which can be represented graphically by the
Gaussian curve, each score may be thought of
as a locus on the base line of that curve.
The perpendicular to the base line that cuts
the normal area into two equal parts is at
the mean, and its base-line value in a distribution
of D-scores is zero. Perpendiculars
erected +1σ to the right and -1σ to
the left of this zero point include approximately
68.27 percent of the total area of
the curve. That is to say, the unit of
base-line variation is one sigma, and ±3 of
these sigma units comprehend approximately
99.7 percent of the total curve area. With
these things in mind one can proceed to reduce
any series of raw scores to the D-score
basis. The comparability rests in the fact
that the means have been made equal to each
other by transformation to the D-score
basis. The mean scores, under these conditions,
in normal curve base-line units are
equal to zero, and the sigmas are each equal
to unity, for the same reason. To convert
a score or a series to the D-score basis
the following formula is employed.

$$D_x = \frac{X - M_x}{\sigma_x} = \frac{d}{\sigma_x} \qquad (1)$$

In this formula $D_x$ equals the D-score
corresponding to $X$, its raw score; $M_x$
is the arithmetic mean of the X series; and

$$\sigma_x = \sqrt{\frac{\sum d^2}{N}}$$

in which the small letter $d$ represents
the distance in raw score points above
or below the group mean; that is, $d = X - M_x$.
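
Formula (1) can be sketched in a few lines of code. This is an illustration only: the function name `d_scores` and the five raw scores are hypothetical, and the standard deviation is taken over N, as in the definition of sigma given with the formula.

```python
import math


def d_scores(raw_scores):
    """Convert raw scores to D-scores: (X - M_x) / sigma_x, with sigma over N."""
    n = len(raw_scores)
    mean_x = sum(raw_scores) / n
    sigma_x = math.sqrt(sum((x - mean_x) ** 2 for x in raw_scores) / n)
    return [(x - mean_x) / sigma_x for x in raw_scores]


# Hypothetical raw scores, chosen only to illustrate the computation.
scores = [60, 75, 90, 105, 120]
d = d_scores(scores)
print([round(value, 3) for value in d])
```

Whatever raw series is supplied, the resulting D-scores have a mean of zero and a standard deviation of one, which is the sense in which they are directly comparable.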
V-scores: One may wish to make a series
of raw X-scores directly comparable to
another series of raw scores Y, secured by
the same individuals or their controls on
another scale, which has a different mean
and a different variation unit. The formula
useful in this case involves the same assumptions
as does that for D-scores. Part
of the technical procedure is similar also.
A statement of the formula shows this to be
true.

$$V_x = M_y + \sigma_y D_x \qquad (2)$$

In this formula $V_x$ equals the converted
X-score in terms of the Y series,
whose central tendency is $M_y$ and whose deviation
measure in raw score points is $\sigma_y$.
$D_x$ is the same as that used in formula (1)
above. If only one score is wanted, $D_x$ in
the form of

$$\frac{X - M_x}{\sigma_x}$$

is found, and formula (2) then can be employed
as given above. If the whole X series
is to be converted, time can be saved
by reducing the formula to a simpler state,
in which form it will require fewer mathematical
computations, as follows:


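Since $D_x = \dfrac{X - M_x}{\sigma_x}$, by (1) above,

then the term $\sigma_y D_x$ of (2) is

$$\sigma_y D_x = \sigma_y \left( \frac{X - M_x}{\sigma_x} \right) = \frac{\sigma_y}{\sigma_x} (X - M_x)$$

Equation (2) becomes then, by substitution,

$$V_x = M_y + \frac{\sigma_y}{\sigma_x} (X - M_x)$$

In the right-hand member of this last
equation there are now four constants, namely,
$M_y$, $\sigma_y$, $M_x$ and $\sigma_x$, all of which may be
collected as follows and equated to $C$, a
large constant, thus:

$$C = M_y - \frac{\sigma_y}{\sigma_x} M_x$$

Equation (2) then becomes:

$$V_x = C + \frac{\sigma_y}{\sigma_x} X$$

but $\sigma_y / \sigma_x$ is also a constant. Let this be
represented as $c'$.
We then get,

$$V_x = C + c'X \qquad (3)$$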
When a calculating machine is at hand
this formula is very useful. Put constant
c' in as the multiplicand for the whole series,
multiply it by each variable raw score
X, and add the respective products to constant
C. The total in each case is the
converted score, $V_x$. A converted score may
be compared directly with its corresponding
score Y, for now they have the same mean
and the same standard deviation.
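
The two-constant procedure can be sketched as follows. The function names and the means and sigmas used are hypothetical, chosen only so that the arithmetic is easy to follow; the check at the end compares the short form with the original form, mean of Y plus sigma of Y times the D-score.

```python
def v_score_constants(mean_x, sigma_x, mean_y, sigma_y):
    """Return (C, c') such that V = C + c' * X."""
    c_prime = sigma_y / sigma_x
    big_c = mean_y - c_prime * mean_x
    return big_c, c_prime


def v_score(x, big_c, c_prime):
    """Convert one raw X-score into the units of the Y series."""
    return big_c + c_prime * x


# Hypothetical series: X has mean 90 and sigma 15; Y has mean 100 and sigma 30.
big_c, c_prime = v_score_constants(90, 15, 100, 30)
converted = [v_score(x, big_c, c_prime) for x in (60, 90, 105)]
print(big_c, c_prime, converted)

# Same result by the two-step route: M_y + sigma_y * D_x.
direct = [100 + 30 * ((x - 90) / 15) for x in (60, 90, 105)]
print(direct)
```

The constants are computed once for the whole series, so each conversion costs one multiplication and one addition, which is the saving the text describes.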

T-scores: The only essential difference
among the D-scores, V-scores, and T-scores
is that each series has a different
mean and sigma. The D-scores have M = 0
and σ = 1. V-scores have M and σ both
equal to those of the series with which
they are to be compared. T-scores arbitrarily
set a mean and a sigma, generally
at 50 and 10 respectively. Any other arbitrary
mean and variation units might be
used. These are suggested, not only because
others have employed them, but primarily
because they are easy to work with.

The formula to use in reducing scores
to this T-score basis is built out of formula
(2) as follows: Since $M_y = 50$ and
$\sigma_y = 10$, by arbitrary assumption, then by
substituting in (2)

$$T_x = 50 + 10 D_x \qquad (4)$$

As before, if only one T-score is desired,
$D_x$, in the form of $\dfrac{X - M_x}{\sigma_x}$, is found;
next multiply by 10, by moving the decimal
one place to the right, then add this product
algebraically to 50. Corresponding $T_y$
scores from another raw series of Y scores
may be found in the same way and thus direct
comparison becomes possible, $T_x$ with
$T_y$. However, as before, if the whole series
is to be put on the basis of T-scores,
labor can be reduced considerably by simplifying
the statement so that it will require
fewer mathematical computations, as
follows:

By substituting from (1) to find the
term $10 D_x$, we get,

$$10 D_x = 10 \, \frac{X - M_x}{\sigma_x}$$

But $10 / \sigma_x$ is a constant value. Let this
equal $c'$. Substituting further:

$$10 D_x = c'(X - M_x) = c'X - c'M_x$$

In this equation, $c'M_x$ is also a constant
in value, a product of two constants.
The new mean 50 in equation (4) above is
also a constant. If these constants are
combined, then the large constant,
$C$, involved in equation (4) becomes:

$$C = 50 - \frac{10}{\sigma_x} M_x = 50 - c'M_x$$

Substituting in the right-hand member of
equation (4) above,

$$T_x = c'X + (50 - c'M_x) = c'X + C \qquad (5)$$


A calculating machine is useful at
this juncture, as before suggested. The
small constant, c', can be used as multiplicand
in the machine. Multiply it by
each variable X-score respectively and add
to each product the constant C. The
T-scores are thus derived, ready for direct
comparison with all other T-scores derived
in the same way.
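
Formula (5) can be sketched in the same manner. The function names are hypothetical; the mean of 90 and sigma of 15 are the values used in the article's later illustration, and the other raw scores are invented.

```python
import math


def t_score_constants(mean_x, sigma_x):
    """Return (C, c') such that T = c' * X + C, for a mean of 50 and sigma of 10."""
    c_prime = 10.0 / sigma_x
    big_c = 50.0 - c_prime * mean_x
    return big_c, c_prime


def t_score(x, big_c, c_prime):
    """Convert one raw X-score to a T-score by formula (5)."""
    return c_prime * x + big_c


big_c, c_prime = t_score_constants(90, 15)
t_values = [t_score(x, big_c, c_prime) for x in (60, 90, 120)]
print([round(t, 2) for t in t_values])

# Cross-check against formula (4): T = 50 + 10 * D.
check = [50 + 10 * ((x - 90) / 15) for x in (60, 90, 120)]
print(all(math.isclose(a, b) for a, b in zip(t_values, check)))
```

A raw score two sigmas below the mean therefore lands at a T-score of 30, and the mean itself at 50.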

P-scores: It is seen from the above
discussion that V-scores and T-scores are
a direct modification of D-scores. P-scores
ordinarily are derived in a different way
and hence are seldom thought of as being
related to any of the others. However,
when sampling is large enough to justify
percentile scores, that is, large enough
to make the distribution of raw scores fairly
stable, the P-scores can also be derived
from the raw series through the D-score
process. An understanding of the nature of
D-scores, therefore, is a chief desideratum
of comparable scores. The formula useful
in this transposition can best be put in a
combination of letter symbols and words as
follows:

$P_x$ = 1.00 (total area of normal
curve) minus $A$, the portion of the normal
curve area to the right of $D_x$ on the base
line, when $D_x$ is secured as in (1).

$$P_x = 1.00 - A \qquad (6)$$
In finding the value of A the following
rule will be helpful. First secure a
normal curve area table.¹ When the raw
score is less than the mean of its distribution,
A = .500 plus the figure in the
area column corresponding to the D-score of
the raw X-score. When the raw X-score is
greater than its mean, A = .500 minus the
figure in the area column corresponding to
the D-score of the raw X-score in hand.

An illustration will make the procedure
more clear. If a raw score in the X series
is 60 when the mean of X is 90 and
$\sigma_x$ is 15, what is its P-score or $P_x$ value?
By formula (1),

$$D_x = \frac{60 - 90}{15} = -2.00.$$

The normal curve area table shows that when
in column $x/\sigma$, or the D-score, the value is
2.00, the figure in the corresponding area
column is .4772. What do these figures signify?
In the first place this D-score is
below or to the left of the mean, because
60 is less than 90. Above the mean is 50
percent of the area. Between the mean and
the -2.00 D-score is 47.72 percent more
of the area. There is then
50 + 47.72 or 97.72 percent of the area or
number of cases involved above this score.
That is to say, A in formula (6) is 97.72
percent or .9772. Therefore, in this case,
we have $P_x$ = 1.00 - .9772 = .0228, or the 2.28
percentile.*
To find a given P-score in terms of D-scores the same process can
readily be reversed. For example, given a P-score of .25, or $Q_1$, as
it is frequently called, what is its value in the D-score series? By
substitution in (6) above:

.25 = 1.00 - A

A, then, = .75 of the total area.

But when $D_x$ = 0, 50 percent of the total area is to the right of
that point. Therefore, $D_x$ must be less than 0, or negative, to the
left of 0 on the base line sufficiently far to comprehend .75 - .50 or
.25 more of the total area. Consult the table of curve areas and find
that the $D_x$ or $x/\sigma_x$ which corresponds to .2500 in the area
column is .6745. Since the P-score is below the mean (.50 is the mid
value of percentiles) this D-score is negative, or -.6745. Therefore,
a $Q_1$ or P-score of .25 = a $D_x$ score of -.6745.
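
The two conversions just illustrated can be sketched in a few lines of Python. This is only a sketch under stated assumptions: it replaces the printed normal curve area table with the standard library's `statistics.NormalDist`, and the function names `d_score`, `p_score` and `d_from_p` are hypothetical, chosen for this illustration.

```python
from statistics import NormalDist

# Standard normal curve (mean 0, sigma 1); stands in for the area table.
STANDARD_NORMAL = NormalDist()

def d_score(raw, mean, sigma):
    """D-score: deviation of a raw score from the mean, in sigma units."""
    return (raw - mean) / sigma

def p_score(raw, mean, sigma):
    """P-score by formula (6): 1.00 minus A, the area to the right of D."""
    area_right = 1.0 - STANDARD_NORMAL.cdf(d_score(raw, mean, sigma))
    return 1.0 - area_right

def d_from_p(p):
    """Reverse process: the D-score below which a proportion p of cases lie."""
    return STANDARD_NORMAL.inv_cdf(p)

# The two worked examples from the text.
print(round(p_score(60, 90, 15), 4))   # raw score 60, mean 90, sigma 15
print(round(d_from_p(0.25), 4))        # first quartile
```

Rounded to four places, the two calls reproduce the figures worked out above, .0228 and -.6745.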

Because Quartiles and Deciles are frequently useful in analyzing raw
scores the following table will be found a material aid. It is
compiled by using the method just described.


TABLE I

QUARTILES, DECILES, AND CORRESPONDING D-SCORES

Quartiles and Deciles     P_x       D_x
Q1                        .25      -.6745
Q3                        .75       .6745
D1                        .10     -1.2817
D2                        .20      -.8416
D3                        .30      -.5344
D4                        .40      -.2533
D5                        .50       .0000
D6                        .60       .2533
D7                        .70       .5344
D8                        .80       .8416
D9                        .90      1.2817








The adequacy of these relationships depends upon the appropriateness
of the basic assumption involved; namely, that the group being
considered is fairly normal. From this discussion it is hoped that the
reader will be better enabled to make his different series of scores
comparable. If he is given the group mean and the group sigma, any or
all of the scores of a given series can be reduced by the suggested
methods. Or, given the scores and their $P_x$, the $D_x$ can be
computed. From $D_x$ the $V_x$ or $T_x$ may be computed by employing
the formulae herein derived and presented.





1. Morton, Robert Lee, Statistical Tables, New York: Silver Burdett
and Co., pp. 44-47, or any other normal curve area tabulation.

2. Note that in speaking of percentiles frequently the decimal is
omitted. A P-score of .0228 is the same as percentile 2.28.


THEIR CONSTRUCTION AND USES*


by 
Hal D. Draper 
Fresno State College 
Fresno, California 


It has become widespread practice in the field of education to employ
the normal probability curve in assigning course "grades" or "quality
marks"; a more or less standard distribution for normal classes being:
7% each of A's and F's (failures); 24% each of B's and D's; and 38%
C's. This distribution of grades is based on the following application
of the probability curve, the equation for which is

$y = y_0 \, e^{-x^2 / 2\sigma^2}$

where $y_0$ is the height of the central ordinate, e is the base of
the natural logarithms, and $\sigma$ is the so-called standard
deviation. Taking the central ordinate at 0 (zero) on the base line,
distances to the right being positive, ordinates are erected at
$x = \pm\tfrac{1}{2}\sigma$ and at $x = \pm\tfrac{3}{2}\sigma$. The
letter grades are assigned to intervals along the base line as
indicated in the upper scale of Fig. 1, page 185.

The area between the curve, the base line and the two ordinates is a
measure of the number of scores falling in each grade for a "normal
distribution", and yields 38.30% for C, 24.17% each for B and D,
leaving 6.68% each for A and F. While, as indicated in Fig. 1, it is
necessary to proceed from $-\infty$ to $+\infty$ in order to include
100% of the scores in a theoretical array, cutting off the base line
at $\pm 2\tfrac{1}{2}\sigma$ neglects only 1.24%, and experience shows
that in an actual array of test scores these limits are rarely
exceeded. Cutting off the base line at $\pm 3\tfrac{1}{3}\sigma$
neglects only 0.08% of a theoretical array, and an actual array very
rarely yields a score outside of these limits.
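
The percentages quoted above can be checked numerically. The following is a minimal sketch, assuming Python's standard `statistics.NormalDist` in place of a printed area table; the helper name `area_between` and the variable names are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal curve: mean 0, sigma 1

def area_between(lo, hi):
    """Percent of a normal distribution lying between lo and hi (sigma units)."""
    return 100.0 * (Z.cdf(hi) - Z.cdf(lo))

grade_c = area_between(-0.5, 0.5)              # C: -1/2 sigma to +1/2 sigma
grade_b = area_between(0.5, 1.5)               # B (and, by symmetry, D)
grade_a = 100.0 * (1.0 - Z.cdf(1.5))           # A (and, by symmetry, F)
beyond_2_5 = 100.0 - area_between(-2.5, 2.5)   # neglected beyond +/- 2 1/2 sigma

for name, value in [("C", grade_c), ("B or D", grade_b),
                    ("A or F", grade_a), ("beyond 2.5 sigma", beyond_2_5)]:
    print(f"{name}: {value:.2f}%")
```

The value for C comes out a hundredth of a percent below the 38.30% obtained from a four-place table; the other figures agree to two decimals.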





The whole theory upon which the probability curve is based shows that
its use in educational work approaches validity only for fairly large
classes of non-specialized students: to apply it to small classes, or
to advanced classes, whether large or small, probably is not
justified. In the hands of a teacher who recognizes its limitations,
however, it can be used as a valuable guide in assigning grades even
in small or advanced classes. In this discussion, it will be assumed
that the curve may be applied legitimately.

While final grades are quite generally reported in the above
letter-grades, or their equivalents, standings on individual tests are
frequently given in terms of "thirds" of letter grades (e.g., C-, C,
and C+) by dividing each $\sigma$-interval on the base line into
thirds as indicated in the middle scale of Fig. 1. Many teachers feel
that the validity of their tests enables still finer gradations of
quality to be distinguished. The writer proposes the adoption of a
suitable Standard Score Number Scale (SsNS) to meet this need. The
SsNS is merely a set of numbers applied to an equally spaced scale
along the base line of the curve with the zero at a suitable distance
to the left of the central ordinate, which represents a "middle C"
grade, and so adjusted that divisions between thirds of letter-grades
will fall halfway between numbers on the scale.

1. References to the literature have been largely omitted from this
paper. In a paper entitled "Marks and Marking Systems: A Digest",
Jour. of Educ. Research, XXVII (December, 1933), pp. 259-272,
A. Duryee Crooks has prepared a rather extensive bibliography covering
this field, to which the reader is referred for more detailed
information.

In addition to the advantage of providing a means for expressing any
desired degree of gradation in quality within a letter-grade, such a
SsNS, by reducing several different test scores to the same basis of
comparison, enables one to compute quantitatively a student's average
("weighted" if desired) by standard methods, a task which is almost
impossible when letter-grades are used. For example, take the case
where a student's average is to be determined from the results of two
mid-term examinations and one final examination, the final to count
twice as much as a mid-term. If test grades such as C, B+ and C+ are
recorded for the student, it is difficult to decide whether his
average should be a B- or a C+. Adopting a SsNS such that, say,

57.5 < C < 63.5 < C+ < 69.5 < B- < 75.5 < B+ < 81.5,

and recording the three scores as 62, 80 and 69, enables one to
determine the "weighted average" as follows:
(62 + 80 + 2 x 69) / 4 = 70, which gives a B- as the correct average.
If the three test grades are 58, 76 and 69, however, the weighted
average is only 68, giving C+ as the correct average grade for the
student. Another method for determining the weighted average of a
series of $S_s$-numbers will be given later.
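
The arithmetic of this example can be sketched in Python as follows. It is an illustration only; the function name `weighted_mean` is hypothetical, and the weights (1, 1, 2) encode the stated rule that the final counts twice as much as a mid-term.

```python
def weighted_mean(scores, weights):
    """Weighted mean of standard scores: sum of weight x score over sum of weights."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for s, w in zip(scores, weights)) / sum(weights)

# Two mid-terms (weight 1 each) and a final (weight 2), as in the text.
weights = [1, 1, 2]
print(weighted_mean([62, 80, 69], weights))  # first student
print(weighted_mean([58, 76, 69], weights))  # second student
```

The two results, 70 and 68, fall on either side of the 69.5 division, which is what separates the B- from the C+ in the example.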

The normal distribution, letter-grades, and standard score norms

In constructing and using a SsNS, several principles must be
thoroughly understood. First, it must be strongly emphasized that
$S_s$-numbers are not to be confused with "percentage-of-achievement"
marks. Teachers (and students) who are familiar with the rather widely
used "percentage-of-achievement-scale", which makes a test score of
less than 60% an F; 60-69% a D; 70-79% a C; etc., are disturbed to
find that $S_s$-numbers derived from certain poorly chosen SsNS's have
values widely divergent from percentage marks: a definite "positive
achievement" may receive a negative $S_s$-number on the one hand, or a
very low percentage mark may receive a disproportionately high
$S_s$-number on the other.

It must be admitted that when there is a large discrepancy between the
$S_s$-numbers and the empirical scores ($S_E$-numbers) on a test, a
bad psychological effect may be produced in the student. Thus, if on a
test an $S_E$ of, say, 30 is transmuted into an $S_s$ of -5, the
student is likely to be resentful and puzzled to account for a
definite "positive achievement" yielding a negative score.
On the other hand, if the $S_E$ of 30 yields an $S_s$ of 65 (which
still may be a "failing" grade), the student is likely to feel that he
has been "pretty lucky" to get a mark of more than twice the number of
points answered on the test. In either case, he is not encouraged to
put forth greater effort in his subsequent work.

In using a SsNS, therefore, it must be borne in mind that to overcome
the difficulties mentioned above, each test should be designed to fit
the SsNS chosen. After discussing the principles involved in
constructing the SsNS, we will be in a position to determine the
nature of the tests which will fit the scale.

If $M_s$ is the $S_s$-number which is to represent a "middle C" grade;
$M_E$ the mean (average) of the $S_E$'s in a test (which is to be set
at the "middle C" grade); n the number of degrees of gradation in
quality to be distinguished in each third of a letter-grade (i.e., 3n
is the number of gradations within an interval of $1\sigma$ on the
base line of the curve); then in a test for which the standard
deviation, $\sigma$, has been determined, the standard score, $S_s$,
corresponding to any given empirical score, $S_E$, on the test is
given by the equation

$S_s = M_s + \frac{3n}{\sigma}\,(S_E - M_E)$    (1)

This may be rearranged to give a form more convenient for computing
the standard scores on a calculating machine as follows

$S_s = \frac{3n}{\sigma}\left[S_E - \left(M_E - \frac{M_s\,\sigma}{3n}\right)\right]$    (2)

Equation 2 shows that the zero on the SsNS will correspond to an
empirical score

$S_E = M_E - \frac{M_s\,\sigma}{3n}$.    (3)

Now, both from the psychological reasons mentioned above, and the
practical consideration that negative numbers are more difficult to
handle in computing averages, negative standard scores are
undesirable. To avoid obtaining negative $S_s$'s, we see from Equation
2 that the term in the square brackets must never have a negative
value. This implies that we must make $M_s/n$ large enough so that
$(M_s\sigma)/3n$ is practically always numerically larger than $M_E$,
thus making the term in brackets positive even when $S_E = 0$. As
pointed out above, scores in an actual array very rarely fall below
$-3\tfrac{1}{3}\sigma$ (measured from the mean): setting the zero on
the SsNS at this point on the base line of the curve, in the writer's
experience, has been found to give a very satisfactory scale from the
standpoint of rarely yielding negative $S_s$'s. Of course, setting the
zero on the SsNS further to the left will still further decrease the
possibility of obtaining negative $S_s$'s, but from the relationship
between $M_s$ and n in Equation 2, this requires an undesirably large
value for $M_s$ or too small a value for n.

By adopting $-3\tfrac{1}{3}\sigma$ for the zero point, we thereby fix
the value of $M_s = 10n$. Equations 1 and 2 then become

$S_s = 10n + \frac{3n}{\sigma}\,(S_E - M_E)$, and    (4)

$S_s = \frac{3n}{\sigma}\left[S_E - \left(M_E - 3\tfrac{1}{3}\sigma\right)\right]$    (5)

An examination of Equations 1 to 5 shows that only two of the three
terms, $M_s$, n, and the 0-point on the SsNS, can be arbitrarily
fixed: having chosen the zero-point, we may set either $M_s$ or n, but
not both. Thus if it is desired that $M_s$ should be set at, say, 70,
then n = 7, i.e., all tests which are designed to fit this SsNS should
be capable of distinguishing valid gradations of quality of 7 points
within a third of a letter grade. This requires (as shown below) that
each test should be capable of giving not less than 140 distinct
scores.

These equations tell us further that the points on the SsNS separating
the thirds of letter-grades fall at odd multiples of n/2: in order,
therefore, to have these points fall halfway between numbers on the
scale, for even values of n, it is necessary to add or subtract 1/2 in
Equations 4 and 5. For n an even number, then, we will use the
following equations





1. Many teachers feel that they are increasing the capacity for
distinguishing gradations in quality of a test having a small number
of items by giving a large number of "points" for each item: this is
entirely erroneous. If a test contains, say, 14 problems for each of
which 10 "points" are given, with a possibility of obtaining "half
credit", instead of yielding a test capable of distinguishing seven
gradations within a third of a letter-grade, only 28 distinct scores
are possible, making for only one (or at most, two) valid gradations
within a third of a letter-grade.


March, 1935


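$S_s = 10n + \tfrac{1}{2} + \frac{3n}{\sigma}\,(S_E - M_E)$, or    (6)

$S_s = \tfrac{1}{2} + \frac{3n}{\sigma}\left[S_E - \left(M_E - 3\tfrac{1}{3}\sigma\right)\right]$    (7)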
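
Equations 4 and 6 lend themselves to a short computational sketch. The Python below is an illustration under stated assumptions, not the writer's own procedure: the function name `standard_score` is hypothetical, the zero of the scale is taken at 3-1/3 sigma below the mean (so that the "middle C" number is 10n), and the test mean of 50 and sigma of 12 used in the example are invented values.

```python
def standard_score(s_e, mean_e, sigma, n):
    """Standard score for empirical score s_e on a scale with zero at
    3-1/3 sigma below the mean (middle C = 10n).

    Odd n follows Equation 4; even n adds 1/2 as in Equation 6, so that
    divisions between thirds of letter-grades fall halfway between numbers.
    """
    half = 0.5 if n % 2 == 0 else 0.0
    return 10 * n + half + (3 * n / sigma) * (s_e - mean_e)

# Hypothetical test: mean 50, sigma 12.
print(standard_score(50, 50, 12, 5))   # a score at the mean, odd n
print(standard_score(50, 50, 12, 6))   # the same score, even n
print(standard_score(62, 50, 12, 5))   # one sigma above the mean
print(standard_score(10, 50, 12, 5))   # 3-1/3 sigma below the mean
```

A score at the mean lands on 10n (or 10n + 1/2), a score one sigma above the mean lies 3n points higher, and a score 3-1/3 sigma below the mean falls at the zero of the scale.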


Tables I and II, page 188, give the values of the $S_s$-numbers which
fall between the thirds of letter-grades and at the mid-point, for odd
and even values of n, respectively. These are taken directly from
Equations 5 and 7. Values for n = 5 and n = 6 are also shown as an
illustration. These are the values that the writer recommends as being
the most suitable for the ordinary examination, odd numbers being
preferable to even, in general.
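
The entries of the two tables follow a simple pattern that can be sketched in Python. This is an illustrative sketch only: the names `GRADES`, `grade_scale` and `letter_grade` are hypothetical, and the rule coded here (mid-points at whole multiples of n, divisions halfway between, plus 1/2 throughout when n is even) is read off from Equations 4 to 7 rather than copied from the writer's forms.

```python
# Thirds of letter-grades in ascending order; index k has its mid-point at k*n.
GRADES = ["F----", "F---", "F--", "F-", "F", "F+",
          "D-", "D", "D+", "C-", "C", "C+",
          "B-", "B", "B+", "A-", "A", "A+",
          "A++", "A+++", "A++++"]

def grade_scale(n):
    """Return (grade, lower division, mid-point, upper division) for each third."""
    half = 0.5 if n % 2 == 0 else 0.0   # even n: shift everything by 1/2
    return [(g, (k - 0.5) * n + half, k * n + half, (k + 0.5) * n + half)
            for k, g in enumerate(GRADES)]

def letter_grade(s_s, n):
    """Letter-grade third containing the standard score s_s, or None if outside."""
    for grade, lower, _mid, upper in grade_scale(n):
        if lower < s_s < upper:
            return grade
    return None

for grade, lower, mid, upper in grade_scale(5)[9:12]:   # C-, C, C+ for n = 5
    print(grade, lower, mid, upper)
print(letter_grade(70, 6), letter_grade(68, 6))
```

With n = 6 the last line classifies the two weighted averages of the earlier example, 70 and 68, as B- and C+.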

We are now in a position to discuss the method of adjusting a test to
fit the particular SsNS which may be adopted. First, the number of
items in the test should not be less than enough to provide about 20n
distinct scores (see footnote on page 186). Second, the difficulty of
the items should be such that the poorest student will get some score,
whereas the best student is not likely to answer more than about 90%
of the items. This implies that the teacher has some knowledge
concerning the abilities of his students, and also some information
about the difficulty, from the students' standpoint, of the items he
proposes to use in the test. For a teacher who has considerable
experience with teaching a given course, the use of objective type
tests which have been employed a number of times in preceding classes
furnishes the best means for attaining this second objective.

It follows from the above that a test designed to yield from 90 to 120
distinct scores cannot be expected to yield valid gradations of more
than four to six points within a third of a letter-grade (n = 4 to 6).
In the writer's experience with True-False type tests, where the score
is obtained by subtracting the wrong answers from the right ones, a
larger number of items must be provided to get the same gradation:
between 25n and 30n has been found satisfactory.

Many teachers have adopted the practice of giving frequent short
quizzes of from 15 to 25 items, supplemented by one or more mid-term
examinations and a final examination. The course grade is then
determined from these scores by some empirical method of "weighting."
As pointed out above, the use of a SsNS enables one to weight the
scores in a quantitative (though not strictly objective) manner, since
nearly any method of weighting involves a large subjective element. It
must be realized, as many writers have pointed out, that assigning
equal value to each item in a test involves in many cases a tremendous
subjective weighting.

One of the methods of weighting test scores in computing an average
has already been described, i.e., expressing each test score in terms
of the same SsNS (which we will designate the Basic SsNS), and then
assigning an arbitrary weight to each test. The general formula for
obtaining a "weighted mean" from k scores, $a_1, a_2, a_3, \ldots, a_k$,
is

$WM = \frac{\Sigma\, w_i \times a_i}{\Sigma\, w_i}$    (8)

where $w_i$ is the weight of the i-th score, and $\Sigma$ signifies
taking the algebraic sum of the terms to the right of the symbol.

Another method that is quite satisfactory, especially for those who
give short quizzes together with longer examinations, is the
following: for the short quizzes, involving from 20 to 25 items,
express the $S_s$'s in terms of a SsNS taking n = 1. For the mid-term
examinations, express the score in terms of a SsNS taking n = 5:
similarly, the final examination may be expressed in terms of a SsNS
taking n = 10 or 20, depending upon the number of items involved,
i.e., the number of distinct scores possible. The mean of these scores
is obtained by the equation

$WM = \frac{n_B \times \Sigma\, a_i}{\Sigma\, n_i}$    (9)

where $n_B$ is the value of n taken for the Basic SsNS, and $n_i$ is
the value of n taken for the i-th test.

While this equation gives exactly the same results as that of Equation
8 when all scores have been reduced to the Basic SsNS, the latter
method is better adapted to the use of teachers who employ the
"running total" method of recording test scores. This practice is to
be commended since a

The relation of these $S_s$-numbers to the Normal Curve is shown in the lower scale in Fig. 1.


188 JOURNAL OF EXPERIMENTAL EDUCATION Volume III, No. 2 
TABLE I: FOR ODD VALUES OF n (illustrated for n = 5)
TABLE II: FOR EVEN VALUES OF n (illustrated for n = 6)

Rows marked (between) give the Ss between letter-grades; rows headed
by a letter-grade give the Ss at the mid-point.

                   TABLE I                TABLE II
Letter-grade       General     n = 5      General            n = 6
(between)          -0.5 n       -2.5      -0.5 n + 1/2        -2.5
F----               0 n          0         0 n + 1/2           0.5
(between)           0.5 n        2.5       0.5 n + 1/2         3.5
F---                1 n          5         1 n + 1/2           6.5
(between)           1.5 n        7.5       1.5 n + 1/2         9.5
F--                 2 n         10         2 n + 1/2          12.5
(between)           2.5 n       12.5       2.5 n + 1/2        15.5
F-                  3 n         15         3 n + 1/2          18.5
(between)           3.5 n       17.5       3.5 n + 1/2        21.5
F                   4 n         20         4 n + 1/2          24.5
(between)           4.5 n       22.5       4.5 n + 1/2        27.5
F+                  5 n         25         5 n + 1/2          30.5
(between)           5.5 n       27.5       5.5 n + 1/2        33.5
D-                  6 n         30         6 n + 1/2          36.5
(between)           6.5 n       32.5       6.5 n + 1/2        39.5
D                   7 n         35         7 n + 1/2          42.5
(between)           7.5 n       37.5       7.5 n + 1/2        45.5
D+                  8 n         40         8 n + 1/2          48.5
(between)           8.5 n       42.5       8.5 n + 1/2        51.5
C-                  9 n         45         9 n + 1/2          54.5
(between)           9.5 n       47.5       9.5 n + 1/2        57.5
C                  10 n         50        10 n + 1/2          60.5
(between)          10.5 n       52.5      10.5 n + 1/2        63.5
C+                 11 n         55        11 n + 1/2          66.5
(between)          11.5 n       57.5      11.5 n + 1/2        69.5
B-                 12 n         60        12 n + 1/2          72.5
(between)          12.5 n       62.5      12.5 n + 1/2        75.5
B                  13 n         65        13 n + 1/2          78.5
(between)          13.5 n       67.5      13.5 n + 1/2        81.5
B+                 14 n         70        14 n + 1/2          84.5
(between)          14.5 n       72.5      14.5 n + 1/2        87.5
A-                 15 n         75        15 n + 1/2          90.5
(between)          15.5 n       77.5      15.5 n + 1/2        93.5
A                  16 n         80        16 n + 1/2          96.5
(between)          16.5 n       82.5      16.5 n + 1/2        99.5
A+                 17 n         85        17 n + 1/2         102.5
(between)          17.5 n       87.5      17.5 n + 1/2       105.5
A++                18 n         90        18 n + 1/2         108.5
(between)          18.5 n       92.5      18.5 n + 1/2       111.5
A+++               19 n         95        19 n + 1/2         114.5
(between)          19.5 n       97.5      19.5 n + 1/2       117.5
A++++              20 n        100        20 n + 1/2         120.5
(between)          20.5 n      102.5      20.5 n + 1/2       123.5





The writer does not advocate the giving of such letter-grades as
A++++, F+, F--, etc.: for properly designed tests grades of A++++,
F----, etc., should be met with very rarely. A better practice,
perhaps, is to list all $S_s$'s below 5.5 n simply as a grade of F,
and all scores above 16.5 n as A+: this practice is indicated in the
middle scale of Fig. 1. It is to be emphasized, however, that the
adoption of such a SsNS as is here recommended will take care of such
unusual scores without difficulty.

student's standing in the class can be determined at any time with the
minimum of effort. For example, if a student has the successive
$S_s$-numbers on quizzes (n = 1) of 8, 14, 12, 9; 53 on a mid-term
(n = 5); and 14, 10, 10 on quizzes, his standing at any time can be
recorded by summing up all the test scores to date. This yields the
successive "running totals" of 8, 22, 34, 43, 96, 110, 120, 130. If
these scores are made available to the class, each student can quickly
ascertain his standing relative to the class at any time, and by
multiplying his total score by the current value of
$n_B / \Sigma\, n_i$, he can determine, approximately at least, his
letter-grade. Thus, after the mid-term examination in the example
above, $\Sigma\, n_i = 9$, and if $n_B = 5$, the student's score is
found to yield (5 x 96)/9 = 53, which is a C+ grade according to the
results in Table I.
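
The "running total" bookkeeping of this example can be sketched in Python. The sketch is illustrative only: the function names `running_totals` and `standing_on_basic_scale` are hypothetical, and the second function simply applies Equation 9 to the scores recorded so far.

```python
from itertools import accumulate

def running_totals(scores):
    """Successive sums of the standard scores recorded to date."""
    return list(accumulate(scores))

def standing_on_basic_scale(scores, n_values, n_basic):
    """Equation 9: total score times n_B, divided by the sum of the n's used."""
    return n_basic * sum(scores) / sum(n_values)

# The student of the example: four quizzes (n = 1), a mid-term (n = 5),
# then three more quizzes (n = 1).
scores   = [8, 14, 12, 9, 53, 14, 10, 10]
n_values = [1, 1, 1, 1, 5, 1, 1, 1]

print(running_totals(scores))
# Standing just after the mid-term, on a Basic SsNS with n = 5.
print(round(standing_on_basic_scale(scores[:5], n_values[:5], 5)))
```

The first line printed is the list of running totals given in the text; the second is the standing of 53 reached after the mid-term.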

While the SsNS will be especially useful for objective type tests, it
should be adaptable to use by teachers who adhere to subjectively
scored tests on a "percentage-of-achievement-scale." Unfortunately,
many teachers have not attempted to apply statistical methods in their
classes on account of the rather formidable appearing mathematics
involved.

To overcome this difficulty, the writer has devised a simple form, The
Draper Histogram and Standard Score Form (soon to be published), which
combines the advantages of algebraic and graphic methods in
determining, by the normal curve, the standard scores for an array of
empirical test scores. With this form there is no necessity for
preparing a "distribution chart" for every test: tally marks are
entered in a histogram form (providing for a range of 100 points)
opposite a scale giving the empirical test score. (This $S_E$-scale is
shown along the left margin of the Histogram Form in Figs. 2 and 3.)
Simple and explicit directions are given for computing by standard
methods¹,² the (approximate) median, mean, standard deviation,
$\sigma$, and the mode for the array. Even one who is unfamiliar with
the theory of statistical methods will have no difficulty in carrying
out the computations, since each successive step is clearly indicated.

The greatest time-conserving feature of the method, however, lies in
the rapid, graphic assignment of $S_s$-numbers and letter-grades. This
is accomplished by means of a Standard Score Assignment Form (SsAF),
the form with the radiating lines shown in Figs. 2 and 3.³ The central
horizontal scale running from 0 (on the right) to 25 refers to the
value of $\sigma$ as determined for a particular test. The vertical
scale at the left of the SsAF is the Basic SsNS, which in the
illustration is taken with the 0-point at $-3\tfrac{1}{3}\sigma$, and
with n = 5.

The method of using the form in assigning $S_s$'s for a test in which
the distribution is approximately normal (a point which can be
determined by a glance at the histogram for the array) is illustrated
in Fig. 2. For this particular test, the mean was found to be 56.90,
and $\sigma$ = 18.35. The Histogram Form is therefore placed with the
edge perpendicular to the central scale on the SsAF, and with 56.90 on
the $S_E$-scale of the HF coinciding with 18.35 on the central scale
of the SsAF. The points on the $S_E$-scale where the radiating lines
on the SsAF intersect then give the $S_E$'s corresponding to the
$S_s$-numbers represented by these radiating lines. Thus, it can be
seen from the figure that the empirical score of 97 yields a standard
score of 74, which is an A-.

With these same forms, standard scores 
can be assigned to "skewed" arrays with a 
minimum of empirical assumptions, and these 
justifiable on rational grounds. The ap- 
plication of the forms to this purpose is 
illustrated in Fig. 3, (the same test being 
used as in Fig. 2), which obviously 





1. Rugg, H. O. Statistical Methods Applied to Education, Houghton,
Mifflin Co., (1917).

2. Lang, A. R. Modern Methods in Written Examinations, Houghton,
Mifflin Co., (1930).

3. The Histogram Form shown in these figures is a crude model prepared
on a mimeograph which the writer has been using for several years. The
SsAF shown was hastily prepared and drawn to half-scale in order to
provide the cuts for this paper. Owing to the reduction in scale, the
scale divisions on the horizontal scales were omitted, and the
radiating lines connecting the points on the SsNS were inserted only
in the "A letter-grade" region. On the full sized forms, the HF is
8-1/2" x 11", and the SsAF is 17" x 22".

Fig. 2. Illustrating use of Draper Standard Score Assignment Form and Histogram when distribution is considered normal

Fig. 3. Illustrating use of Draper Standard Score Assignment Form and Histogram when distribution is considered skewed

is skewed downward slightly.

The assumptions that are made in treating moderately skewed arrays of
test scores are these:

1. The "theoretical mode" is a better measure of "central tendency"
than is the mean or the median. The mode, calculated by Pearson's
empirical rule, Mo = M - 3(M - Md), is therefore taken for the "middle
C" grade.

2. $\sigma' = \sqrt{\Sigma f d^2 / N}$, where the deviations, d, are
computed from the mode, should be utilized instead of the $\sigma$ as
ordinarily computed from the mean.

3. The total range from the bottom of the F- to the top of the A+,
i.e., from 2.5 n to 17.5 n on the $S_s$-scale, should be taken as
$5\sigma'$ on the $S_E$-scale. (As already mentioned, scores outside
of $\pm 2.5\sigma$ are rarely obtained.)

4. A 100-percentile skewness index is computed as follows:

$\mathrm{Sk.In.}_{100} = \frac{A - Mo}{Mo - B}$

where A is the highest and B the lowest empirical score obtained in
the test.

5. The ratio of the $S_s$-interval from the top of the A+ (17.5 n) to
the "middle C" grade (10.0 n) to the interval from the "middle C"
grade to the bottom of the F- (2.5 n) should be made equal to the
$\mathrm{Sk.In.}_{100}$.

6. The intervals on the SsNS should give uniformly increasing
intervals on the $S_E$-scale throughout its length.
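
The quantities named in Points 1, 2 and 4 can be sketched in Python. This is a sketch under stated assumptions rather than the writer's form: the function names are hypothetical, `sigma_about_mode` takes a plain list of scores (so each score has frequency 1) and assumes the root-mean-square form of the deviation about the mode, and the median of 56.75 used below is not given in the text but is back-computed from the mean of 56.90 and mode of 56.45 reported for the illustrative test.

```python
from math import sqrt

def pearson_mode(mean, median):
    """Point 1: Pearson's empirical rule, Mo = M - 3(M - Md)."""
    return mean - 3 * (mean - median)

def sigma_about_mode(scores, mode):
    """Point 2 (assumed form): root-mean-square deviation taken from the mode."""
    return sqrt(sum((s - mode) ** 2 for s in scores) / len(scores))

def skewness_index(highest, lowest, mode):
    """Point 4: 100-percentile skewness index, (A - Mo) / (Mo - B)."""
    return (highest - mode) / (mode - lowest)

# Mean from the text; the median is a hypothetical, back-computed value.
print(round(pearson_mode(56.90, 56.75), 2))
# Invented extreme scores, symmetric about a mode of 55, give an index of 1.
print(skewness_index(100, 10, 55))
```

An index below 1, such as the 0.91 found later for the illustrative test, means the highest score lies nearer the mode than the lowest does.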

It can be seen that the application of the above principles to the
assignment of $S_s$'s to a skewed array resolves itself into a rather
simple geometrical problem when the writer's forms are employed. Thus,
placing the HF on the SsAF with the edge of the former at an angle to
the central scale of the latter (the angle $\theta$ in Fig. 3) will
give uniformly increasing intervals on the $S_E$-scale throughout the
length of the $S_s$-scale (Point 6 above), and obviously the interval
from the top of the A+ to the "middle C" grade can be adjusted with
any desired ratio to the interval from the "middle C" grade to the
bottom of the F- by selecting the correct value for the angle
$\theta$. (Point 5.)

It is also geometrically obvious that rotating the HF through the
angle $\theta$, keeping the edge of the former at a fixed division on
the central scale of the SsAF, will increase the interval on the
$S_E$-scale between 2.5 n and 17.5 n on the $S_s$-scale. This may be
compensated for by translating the HF to the right until the left-hand
edge coincides with some definite division ($\sigma''$) on the central
scale, less than $\sigma'$. Principle 1 may then be complied with by
bringing the mode on the $S_E$-scale to the central line on the SsAF.

Spaces are provided on the HF for computing $\sigma'$ (Point 2), the
100-percentile skewness index, $\mathrm{Sk.In.}_{100}$, and the value
of $\sigma''$. By means of the "Sk.In.-scale" (the scale along the
right-hand margin of the HF) and the upper horizontal scale on the
SsAF, the angle $\theta$ is graphically determined.

In the illustration given in Fig. 3, the mode, Mo = 56.45,
$\sigma'$ = 18.35, and $\mathrm{Sk.In.}_{100}$ = 0.91 (which shows
that the distribution is skewed downward slightly). A mark is then
ruled at 0.91 on the "Sk.In.-scale", crossing the adjacent "R-scale"
at 0.98. This gives the factor by which $\sigma'$ must be multiplied
in order to give $\sigma''$ (18.35 x 0.98 = 17.99 = $\sigma''$).

The HF is then placed with the "Sk.In.-scale" coinciding with the
upper horizontal scale on the SsAF, with the 1.0-division on the
former at 17.99 ($\sigma''$) on the latter. A light pencil mark is
made on the SsAF-scale where the (previously marked) 0.91-division on
the "Sk.In.-scale" falls, and a light pencil line is ruled through
this point and the 17.99 ($\sigma''$) division of the central scale of
the SsAF. When the left-hand edge of the HF is then placed on this
pencil line, and with the mode (56.45) on the $S_E$-scale coinciding
with the central scale line, the situation shown in Fig. 3 is
obtained. The graphic assignment of $S_s$-numbers and letter-grades is
then made as before. It will be noted that the total range of $S_E$'s
from the bottom of the F- to the top of the A+ is quite close to
$5\sigma'$ = 91.75, and that the ratio of the intervals from the top
of the A+ to the "middle C", and from the "middle C" to the bottom of
the F-, is quite close to 0.91, the $\mathrm{Sk.In.}_{100}$.

The effect of considering the skewness in this test, and using the
mode rather than the mean for the "middle C" grade, has been to raise
slightly all of the $S_s$'s for the array, the extreme scores being
affected more than those near the center of the distribution. In
addition, the rather desirable result is obtained of having the top of
the A+ interval brought much closer to the maximum possible $S_E$ on
the test than is the case when the skewness is neglected. This result
is brought about in a large proportion, if not a majority, of the
cases which the writer has encountered, though the very close
agreement in the present illustration is rather fortuitous.

The writer has adopted a practice which still further diminishes the
labor involved in reporting test grades to his classes in that it
obviates the necessity of preparing separate lists of grades for
posting. In addition, it has had a stimulating effect upon the
students in promoting a healthful rivalry between the members of each
laboratory section, and between the different sections as a whole. At
the beginning of the semester, each student is assigned a serial
number which identifies the section of which he is a member as well as
his position in the section. Thus the first member in Section 1 is
given the number 101, the second member, 102, etc. When a test is
given and scored, the last one or two digits of the student's number
are inserted in the correct "histogram block" instead of using a tally
mark, or "blocking in" as shown in the figures. Different colored
pencils are used to distinguish the sections.

When the Histogram Form is completely filled out, and the lines ruled
across between the thirds of letter-grades as shown in the figures, it
is posted on the (locked) bulletin board. Each student, by locating
his serial number in the histogram, can quickly determine his
empirical score, his letter-grade, and by interpolation, his standard
score in the test. In addition, he can see what all the other students
have done on the test and estimate his achievement compared to the
class as a whole. With the histogram before him, and with the
knowledge that the assignment of standard scores is the result of an
objective application of mathematics which treats all the students
alike, he is much less likely to feel that he has been discriminated
against if his score is not a good one. In the past several years,
since adopting this procedure, out of over a thousand freshman
students passing through the writer's classes, scarcely a dozen have
ever seriously questioned the fairness of the grades they received.


A METHOD OF PROVIDING A MORE VALID DISTRIBUTION OF SCHOOL MARKS


by 


R. W. Edmiston 
Miami University 


INTRODUCTION 


The school mark which remains upon the permanent record should be as
valid an estimate of the achievement recorded as the teacher can
possibly provide. Improved measures offer means of providing more
exact evaluation. It is hoped that a more scientific professional
training will eliminate any tendency for the teacher to consider other
than achievement in arithmetic when determining the mark in
arithmetic.
The distribution of marks according to percents said to be derived
from the normal curve is familiar to the educational profession.
While the percents used in the distribution differ, a common one
gives the highest mark, A, to 10 percent of the pupils, with
20 percent B's, 40 percent C's, 20 percent D's, and 10 percent F's.
This particular distribution will be designated as the normal curve
method in the further discussion. Small or selective schools and
classes do not provide normal groups. The fact that a larger group
attains the norm on a test does not assure a normal distribution.
Only carefully standardized tests provide data for direct location
of marks with respect to some norm. These tests are available in
some elementary school subjects. Since many marks are based upon
the results of informal tests, a substitute for the normal curve
method of distribution is desirable. This substitute will be called
the "Standard Deviation Method of Distribution of Marks"; it is an
attempt to apply statistical knowledge more correctly and could be
designated "Improving the Use of the Normal Curve in Marking."








STATISTICAL CONSIDERATIONS 


Since divergence from the mean score, and not the number of scores,
is the measure desired, distance from the mean in standard
deviations is a more direct basis for marks than the corresponding
percents of a normal group. Slight changes of the percents
previously given are made to conform to standard deviation
distances.¹ Thus the 10 percent A's become scores 1.3σ or further
above the mean; the 21 percent B's are scores between .5σ and 1.3σ
above the mean; the C's, or 38 percent adjacent to the mean, are
scores from −.5σ to .5σ inclusive; the 21 percent D's are scores
between −1.3σ and −.5σ; the 10 percent F's are scores −1.3σ or
further below the mean. The test scores of any group may be
converted into comparable form by applying to each set of scores
the formula,

$$\text{C-sc} = 50 + \frac{10\,(S - M)}{\sigma}$$

where S is the individual's score, M the group's mean score, and σ
the standard deviation. The mean of each converted set of test
scores will be 50 and the standard deviation will be 10. Thus 1.3σ
or more above the mean becomes 63 and above; similarly converted
scores of 55 to 62, 45 to 54, 38 to 44, and 37 and below represent
B, C, D, and F marks, respectively.
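The conversion and the cut-offs just described can be tried out in a few lines. The following Python sketch is illustrative only: the function names `converted_score` and `mark_from_score` are assumptions introduced here, the class mean of 65 and standard deviation of 20 are hypothetical, and scores falling exactly on a boundary are handled as the wording above suggests (C includes ±.5σ; A and F include ±1.3σ).

```python
def converted_score(score, mean, sd):
    """Return C-sc = 50 + 10 * (S - M) / sigma for one score."""
    return 50 + 10 * (score - mean) / sd


def mark_from_score(score, mean, sd):
    """Assign a letter mark from the distance of a score from the mean."""
    z = (score - mean) / sd  # distance from the mean in standard deviations
    if z >= 1.3:
        return "A"
    if z > 0.5:
        return "B"
    if z >= -0.5:
        return "C"
    if z > -1.3:
        return "D"
    return "F"


# Hypothetical class with mean 65 and standard deviation 20.
for s in (95, 80, 65, 50, 30):
    print(s, round(converted_score(s, 65, 20)), mark_from_score(s, 65, 20))
```

Because the marks depend only on distance from the mean, the same function serves for any test once its mean and standard deviation are known.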

A beginning teacher has no standards 
from previous pupils' work upon which to 
base an estimate of the location of the 
mean of her present group's achievement. 
After using a test with several groups, the 
mean to be assigned to a normal group can 
be estimated. All marks will then be 





1. These divisions could be shifted at will, and the method described does not remove the need of studies to determine
more scientific criteria for the different school marks.

computed from this mean rather than the
mean of the group considered. Otherwise,
the system described does not provide for
location of the mean as a mark of relative
group ability. If from the average of a
group's intelligence scores or former
achievement test scores their mean is
estimated to be, for example, −.2 of a standard
deviation from the normal mean, all marks
can be determined according to the above
provisions after the mean has been corrected
by adding .2 of the standard deviation.¹


Computation According to This Plan

SAMPLE COMPUTATION IN ORDINARY SITUATION
WHERE MEAN IS NOT CONCERNED

          Test     d = deviation   d² = deviation            Converted score
Pupil     score      from mean        squared        Mark    C-sc = 50 + 10(S − M)/σ
-----     -----    -------------   --------------    ----    -----------------------
  1        30           35              1225           F               32
  2        32           33              1089           F               33
  3        40           25               625           D               37
  4        42           23               529           D               38
  5        44           21               441           D               39
  6        45           20               400           D               40
  7        45           20               400           D               40
  8        51           14               196           D               43
  9        53           12               144           D               44
 10        55           10               100           C               45
 11        59            6                36           C               47
           65  (Mean)
 12        66            1                 1           C               51
 13        68            3                 9           C               52
 14        71            6                36           C               53
 15        75           10               100           B               55
 16        76           11               121           B               56
 17        77           12               144           B               56
 18        77           12               144           B               56
 19        81           16               256           B               58
 20        81           16               256           B               58
 21        82           17               289           B               59
 22        88           23               529           B               62
 23        90           25               625           A               63
 24        93           28               784           A               64
 25        97           32              1024           A               66

N = 25     ΣS = 1618     Σd² = 9503

d = test score − mean, often written S − M.
M = (sum of test scores)/(number of pupils) = ΣS/N = 1618/25 = 64.72 or 65.
N = No. of pupils = 25.  Σ = Summation or Sum of.
d = distance each score is from 65, the mean.

Standard deviation or

$$\sigma = \sqrt{\frac{\Sigma d^2}{N}} = \sqrt{\frac{9503}{25}} = \sqrt{380.12} = 19.5$$

.5σ = .5 × 19.5 = 9.75 or 10.
1.3σ = 1.3 × 19.5 = 25.35 or 25.

From −.5σ to .5σ from mean = scores from (65 − 10) to (65 + 9) or
scores of 55 to 74.² These scores are given marks of C. From −.5σ
to −1.3σ = 25 − 10 or 15 figures below 55, or 40 to 54, which scores
are marked D. Scores lower than −1.3σ, or 40, are marked F. From
.5σ to 1.3σ = 25 − 10 or 15 units above 74, or 75 to 89. Scores in
this interval are marked B. Scores above 1.3σ, or 89, are marked A.
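The arithmetic of the sample computation can be retraced with a short Python sketch. It is offered only as a check, not as part of the method: the score list is a reading of the sample table (a few entries are hard to make out and were chosen so that N = 25 and ΣS = 1618 are reproduced), and the names `half`, `far`, and `mark` are introduced here for illustration.

```python
import math

# Test scores as read from the sample table (25 pupils).
scores = [30, 32, 40, 42, 44, 45, 45, 51, 53, 55, 59, 66, 68,
          71, 75, 76, 77, 77, 81, 81, 82, 88, 90, 93, 97]

n = len(scores)
mean = round(sum(scores) / n)                  # 64.72, taken as 65
sum_d2 = sum((s - mean) ** 2 for s in scores)  # sum of squared deviations
sd = math.sqrt(sum_d2 / n)                     # about 19.5

half = round(0.5 * sd)   # .5 sigma in whole points (10)
far = round(1.3 * sd)    # 1.3 sigma in whole points (25)


def mark(score):
    """Letter mark from the whole-number cut scores used in the text."""
    if score > mean + far - 1:    # above 89
        return "A"
    if score > mean + half - 1:   # 75 to 89
        return "B"
    if score >= mean - half:      # 55 to 74
        return "C"
    if score >= mean - far:       # 40 to 54
        return "D"
    return "F"                    # below 40


counts = {m: 0 for m in "ABCDF"}
for s in scores:
    counts[mark(s)] += 1
print(mean, sum_d2, round(sd, 1), counts)
```

With these scores the counts come to 3 A's, 8 B's, 5 C's, 7 D's, and 2 F's out of 25, that is, 12, 32, 20, 28, and 8 percent.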


If it were known that the mean of this group's scores was .2
standard deviation below the mean of a normal group, the mean used
should have been increased by .2 × 19.5 = 3.9, or approximately 4.
This would require that the mean be raised from 65 to 69 and all
marks computed from 69 as the mean.

Except in the case of fairly small groups or special schools with
selective factors, the group mean approaches that of a normal group
rather closely.

Note that the percentages at the various marks do not agree with
the determined percentages of a normal distribution. Here are
12 percent A's rather than 10 percent, 32 percent B's rather than
20 percent, 20 percent C's rather than 40 percent, 28 percent D's
rather than 20 percent, and 8 percent F's rather than 10 percent.
It should be noted that no one needs to fail by this system of
distributing marks. When no member of the class falls lower than
−1.3 standard deviations from the mean, there are no failures.

The data in the table on the following page show two injustices of
normal curve marking.

(1) In the two schools, the exceptional (according to the
intelligence scores) group in one school has received the same
average mark as the average (according to the intelligence scores)
group in the other. There is large divergence in the personnel of
these two schools as shown by the average intelligence of the two
groups in the trigonometry classes. Both of these groups


1. A similar adjustment of the mean could be made when using the normal curve method.
2. Since 65 is the first whole number above the actual mean, 64.72, 75 is not included.

EXAMPLES OF THE INJUSTICES PROVIDED BY THE NORMAL
CURVE DISTRIBUTION OF MARKS AND THEIR CORRECTION
BY THE STANDARD DEVIATION DISTRIBUTION


The data in this table result from the application of
the normal curve marking in Trigonometry, an elective
course, in two city schools where the average school mark
for entrance to the trigonometry class was 85.

Pupil            Av. Sch.²   Mark in   |   Pupil            Av. Sch.²   Mark in
 No.     I.Q.¹     Mark       Trig.    |    No.     I.Q.¹     Mark       Trig.
-----    -----   ---------   -------   |   -----    -----   ---------   -------
  1       156       95          A      |     1       106       85          C
  2       120       95          B      |     2        …        87          C
  3       133       92          C      |     3       146       95          A
  4       126       87          C      |     4       107       87          D
  5        …        85          …      |     5        …        80          …
  6       135       90          …      |     6       110       85          D
  7        …        90          C      |     7       107       87          C
  8       141       90          B      |     8       115       95          B
  9       121       90          …      |     9       116       90          B
 10       187       87          D      |    10       107       90          C
 11        …        85          C      |    11        …         …          B
 12       126       95          A      |    12        …        92          C
 13       124       87          C      |    13       106       87          C
 14       182       90          B      |    14       121       90          C
 15        …         …          D      |
 16       129       87          C      |
 17       129       85          C      |
 18       126       95          B      |
 19        …        87          C      |
 20       128        …          …      |
 21       188       92          C      |
 22       128       85          D      |
 23       121       87          D      |
 24       128       87          F      |
 25       127       90          C      |
 26       125       92          B      |
-----    -----   ---------   -------   |   -----    -----   ---------   -------
 Av.       …         …          C      |    Av.       …        90          C


























are in the upper third according to the achievement in their
respective schools. There is a difference of 14 points in the
average I.Q. and very little overlapping of individual I.Q.'s
between the groups. This situation offers little assurance to the
colleges which welcome the upper third and exclude the lower third.

(2) In the larger class there is little indication that any pupil
should fail. One pupil with a general average of 90 fails
trigonometry. Ridiculous!

The following computations will show how these pupils should have
been marked. A different group in a different school was used
because all marking had followed the normal curve method and the
scores from which the marks were derived were available.


1. Same test used.


THE MARKS BY THE STANDARD DEVIATION
METHOD ARE OFFERED FOR COMPARISON

                               Normal    C-sc. av. of tests                 Stand.
                               Curve     in course from                     Deviation
Pupil              Av. Sch.²   Course    which mark                         Course
 No.       I.Q.     Marks      Mark      was taken            d      d²     Mark
-------    ----   ---------   ------   ------------------   ----   -----   ---------
  1        148       B+         A             70             20     400       A
  2        126       B+         …             46             −4      16       B
  3        123       A          …             60             10     100       A
  4         …        A          A             65             15     225       A
  5         …        B          D             39            −11     121       C
  6        118       B−         F             36            −14     196       C
  7        117       A          B             63             13     169       A
  8        116       B+         C             42             −8      64       C
  9         …        B          D             39            −11     121       C
 10         …        A          C             45             −5      25       B
 11         …        B          …             51              1       1       B
 12        112       B+         …             58              8      64       A
 13        111       B+         C             45             −5      25       B
 14        111       B          …             46             −4      16       B
 15        110       B          …             44             −6      36       C
 16        109       B          B             57              7      49       A
-------    ----   ---------   ------   ------------------   ----   -----   ---------
Average    117       B+         C             50                              B
Normal     110       C          C











Since the normal curve method of marking had been used to obtain
the marks in the required courses from which the average school
mark was computed, the group's average mark, B+, was 1.5 letters
above the C average or in the upper one-half of the 20 percent
marked B. The + marks were not given on the records but are the
result of averaging the recorded marks.

Since this group's average mark, B+, places them above the middle
of the 20 percent of the scores between .5 and 1.3 standard
deviations above the mean, their achievement is approximately .2
standard deviation less than 1.3σ from the mean, or 1.1σ above the
mean.³ Correcting their marks according to the mean of their
achievement would necessitate placing their mean at 1.1σ above the
mean. Since their average course mark is 50, this 50 is 1.1σ above
the mean for a normal group. The standard deviation of their course
numerical marks is 10. Therefore their mean of 50 is 1.1 × 10 or 11
above the ordinary mean, or the ordinary mean is 50 − 11 or 39. One
score is below this corrected average but within .5σ of same and
therefore receives a C. Including .5 of 10 (the standard deviation)
or 5 scores above the mean shows 39, 40, 41, 42, 43, and 44
deserving of marks of C; proceeding to 1.3σ or to 1.3 of 10 = 13
points above the mean (13 − 5 = 8 more scores), 45, 46, 47, 48, 49,
50, and 51 should receive marks of B, and those above score 51,
marks of A. The marks designated are provided in the last column of
the above table and differ widely from those provided by the normal
curve method.


2. Av. Sch. Marks are taken in required subjects where at least 100 pupils were entered under the same teacher.
3. There are 1.3σ − .5σ or .8σ in the 20 percent of this normal curve where the mark B was given. Of this .8σ, the
upper .4σ is the B+. The middle of this upper .4σ is .2σ from the upper limit or 1.3σ. Therefore, 1.3σ − .2σ or 1.1σ
above the mean is the average achievement of this group. The group's mean is therefore 1.1σ or 1.1 × 10 = 11 points
above the mean of a normal group.
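The correction of the mean described above can be sketched in Python as follows. This is a hedged illustration, not the author's worksheet: the function names are assumptions introduced here, the cut points are rounded to whole points as in the text, and the boundaries follow the marks the text assigns (44 is a C, 45 to 51 are B's, scores above 51 are A's).

```python
def normal_group_mean(group_mean, offset_in_sd, sd):
    """Mean of a normal group when this group stands offset_in_sd above it."""
    return round(group_mean - offset_in_sd * sd)   # 50 - 1.1 * 10 gives 39


def corrected_mark(score, normal_mean, sd):
    """Mark a converted score against the mean of a normal group."""
    d = score - normal_mean
    half, far = round(0.5 * sd), round(1.3 * sd)   # 5 and 13 points when sd = 10
    if d >= far:
        return "A"
    if d > half:
        return "B"
    if d >= -half:
        return "C"
    if d > -far:
        return "D"
    return "F"


base = normal_group_mean(50, 1.1, 10)
for s in (36, 39, 44, 45, 51, 57, 70):
    print(s, corrected_mark(s, base, 10))
```

Marked in this way, no score of 36 or above in such a group falls below a C, which is the point of the comparison with the normal curve marks.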


SUMMARY AND CONCLUSIONS 


(1) Every precaution should be taken to make the school mark a
valid estimate of the achievement represented.

(2) The common normal curve method of 
distributing marks results in injustices 
except for normal groups. Many school 
groups are not normal groups. Therefore, 
both the mean score or general group 
achievement and the individual scores are 
improperly placed by the normal curve meth- 
od of distribution. 

(3) The standard deviation method of 
distribution of marks provides for the cor- 
rect distribution which the determined per- 
centages in the normal curve method of dis- 
tribution denies. A correction can be ap- 
plied to the mean which will take care of 
the group's difference from that of a nor- 
mal group. Failure is not necessitated by 
this method of distributing marks. Actual 
application of the two methods shows the 
superiority of the standard deviation 
method in that the marks provided are much 
more in agreement with the general achievement
record.


THE PRACTICAL STATISTICS OF PREDICTION


by 


Charles H. West
Industrial Commission
Madison, Wisconsin


The purpose of this paper is to report 
a three-year prediction experiment, making 
particular reference to the practical sta- 
tistical problems which were involved. Of 
necessity the experiment was of an explora- 
tory nature. No similar study has been pre- 
sented in the literature. The attention of 
previous writers has been directed to tests 
or other predictive measures, and their rel- 
ative value and how they might be used. Or 
by a study of individual or group case his- 
tories, they point out the psychological 
values of prediction. But the actual me- 
chanical processes of computation, predic- 
tion, and evaluating predictions statisti- 
cally have been somewhat neglected. This 
report will be concerned only with the sta- 
tistical aspects of a practical prediction 
situation. 

The subjects of the experiment were 
731 students enrolled in College Algebra at 
the University of Wisconsin, 279 who com- 
pleted the semester course in 1932, 219 in 
1933, and 233 in 1934. The object was to 
predict as well as possible the grades 
these students would receive. The grades 
were given as letters, A, B, C, D, and F, 
to which, for purposes of arithmetical com- 
putation were assigned arbitrarily the nu- 
merical values 4, 3, 2, 1, and 0, respec- 
tively. The question of the accuracy of 
these grades, or what they actually meas- 
ured, is of no consequence to this experi- 
ment. 

To predict these grades there were 
available: 1) the rank in class in high 
school, expressed as a percentage; 2) the 
percentile rank on a psychological examina- 
tion; and 3) the numerical mark (range 0 
- 100) on an algebra placement examination. 
The first two were part of the record for each student upon
entrance, and the placement examination was given, at the beginning
of the term, to those enrolled in College Algebra. To avoid an
excessive burden of computation, the predictive measures were
reduced to one digit, by dropping the unit's place, and recording
all test marks or percentiles in the range from 0 to 9.

The grades for the 1932 group could not, in the natural course of
events, be predicted. The data of the first year formed the basis
for deriving the first prediction formula, to be used in predicting
grades in subsequent years. This is an essential part of prediction
work, but until the formula has been applied, no prediction has
been accomplished. Analysis of the 1932 data showed that the
psychological examination percentile rank was not of sufficient
value in addition to the other two measures to warrant its
inclusion in a formula. Seldom in educational prognosis does the
use of more than two predictive measures yield a satisfactory
return for the additional labor involved.

Essential data are summarized in Table I. The formula was derived
using three variables: the grades, referred to by the subscript
"1"; placement examination marks, referred to by the subscript "2";
and the high school percentile rank, referred to by the subscript
"3". Although the arithmetic was simplified wherever possible, the
computations were made with great care. Many steps are involved in
the process, several of them subtractions, and in taking these
steps some significant figures are likely to be lost. For this
reason, two variations in the method of computing a regression
coefficient may lead to results which do not agree exactly. To
avoid this difficulty, the writer began with averages computed to
six significant figures, in order to arrive at regression
coefficients correct in three places. Moreover, all computations
were duplicated in order to make the results as accurate as
possible.


                              TABLE I

        SUMMARIZED DATA FOR COMPUTATION OF FORMULAS
            AND STANDARD ERRORS OF ESTIMATE

                                                       1932+1933
                                    1932      1933     Combined     1934
                                    Data      Data       Data       Data
--------------------------------------------------------------------------
Means .................... m₁     2.13978   2.16438    2.15060    2.15880
                           m₂     6.11828   6.10502    6.11245    6.02146
                           m₃     6.31183   6.26484    6.29116    6.62661

Standard Deviations ...... σ₁     1.3567    1.33119    1.34561    1.315
                           σ₂     1.4131    1.98232    1.68724    2.17368
                           σ₃       …       2.54356    2.55729      …

Intercorrelations ........ r₁₂      …       .68754     .63466       …
                           r₁₃    .55328    .51852     .53754     .52891
                           r₂₃      …         …        .40242       …

Regression Coefficients:
  a) Standard Form ....... β₁₂.₃  .4646     .5694        …        .5361
                           β₁₃.₂  .3936       …        .3367      .3578
  b) Score Form .......... b₁₂.₃  .446        …          …          …
                           b₁₃.₂  .208        …          …          …
                           c     −1.90        …          …          …

Number of Subjects ....... N       279       219        498        233

Standard Errors of
  Estimate ............... σ₁.₂₃  .963      .921       .954       .894

Regression Coefficients of
Modified Formulas:
  a) To Predict 1933
     Grades .............. b₁₂.₃  .318       ---        ---        ---
                           b₁₃.₂  .210       ---        ---        ---
                           c     −1.12       ---        ---        ---
  b) To Predict 1934
     Grades .............. b₁₂.₃  .290      .349       .309        ---
                           b₁₃.₂  .232      .145       .197        ---
                           c     −1.15     −0.90      −1.02        ---


The formula derived gave the predicted grade in terms of the
placement examination mark and the high school rank as

$$X_1 = .446X_2 + .208X_3 - 1.90.$$

Suppose, for example, the placement mark were 67, or 6 in the
abbreviated form adopted, and the student was reported as 13th in a
class of 50, making his percentile rank 74, which would have been
recorded as 7. These values, substituted in the formula, give the
predicted grade as 2.23. To simplify the use of the formula for
predicting 1933 grades, a table was made, showing the predicted
grade for each of the hundred possible pairs of values of X₂ and
X₃. Then all predictions could be quickly read from this table.
Such a table would be cumbersome if scores were used on the whole
range of 100 points. In this case, a nomogram or prediction chart
could easily be constructed to save the labor of computing each
value arithmetically. In the present case, a further simplification
was made to lighten the heavy burden of computation. Predicted
grades in the table were rounded to the nearest .4. Thus there were
only 15 different values of the predicted grades, making easier the
construction of scatter diagrams of predicted and actual grades and
the computation of the standard error of estimate.
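The worked example and the prediction table can be reproduced with a small Python sketch. It is meant only as an illustration of the procedure described: the function names are introduced here, and the rule used in `percentile_rank` (percent of the class standing below the student) is an assumption inferred from the single example of 13th in a class of 50 giving 74.

```python
def abbreviate(value):
    """Drop the unit's place: 67 -> 6, 74 -> 7 (a mark of 100 is kept at 9)."""
    return min(int(value) // 10, 9)


def percentile_rank(rank, class_size):
    """Assumed rule: percent of the class standing below the student."""
    return 100 * (class_size - rank) / class_size


def predict_grade(x2, x3):
    """First prediction formula: X1 = .446 X2 + .208 X3 - 1.90."""
    return 0.446 * x2 + 0.208 * x3 - 1.90


def nearest_point_four(grade):
    """Round a predicted grade to the nearest multiple of .4."""
    return round(round(grade / 0.4) * 0.4, 1)


x2 = abbreviate(67)                       # placement mark 67 -> 6
x3 = abbreviate(percentile_rank(13, 50))  # percentile rank 74 -> 7
print(round(predict_grade(x2, x3), 2))    # predicted grade for the example

# Table of predictions for the hundred possible pairs of one-digit values.
table = {(a, b): nearest_point_four(predict_grade(a, b))
         for a in range(10) for b in range(10)}
print(table[(x2, x3)])
```

Once the dictionary is built, every prediction is a lookup, which is the labor-saving device the table provided.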

The standard error of estimate is computed from paired values of
predicted and actual grades as follows: subtract predicted from
actual grades; square these differences; add squared differences;
divide this sum by the number in the group to give the average
squared difference. The square root of this average is the standard
error of estimate. The value computed in connection with the
derivation of a formula is never computed in this way, since no
predictions have been made. The condition which governs the choice
of the regression coefficients is that the standard error of
estimate (or more properly, its square) be reduced to a minimum.
The mathematics which accomplishes this will readily give also the
minimum value. One formula is

$$\sigma_{1.23}^2 = \sigma_1^2\,(1 - \beta_{12.3}\,r_{12} - \beta_{13.2}\,r_{13}).$$

The value obtained in this fashion for 1932 was .963.
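The two ways of arriving at a standard error of estimate can be set side by side in Python. The sketch below is illustrative: the function names are introduced here, and the grades, coefficients, and correlations passed to the functions are made-up values, not figures from the experiment.

```python
import math


def standard_error_observed(actual, predicted):
    """Root of the average squared difference between actual and predicted grades."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def standard_error_expected(sd1, beta12_3, beta13_2, r12, r13):
    """Minimum value given by the fitting: sd1 * sqrt(1 - b12.3*r12 - b13.2*r13)."""
    return sd1 * math.sqrt(1 - beta12_3 * r12 - beta13_2 * r13)


# Hypothetical actual and predicted grades for four students.
print(round(standard_error_observed([2, 3, 1, 4], [2.4, 2.4, 1.6, 3.2]), 3))

# Hypothetical standard deviation, standard-form coefficients, and correlations.
print(round(standard_error_expected(1.0, 0.5, 0.3, 0.6, 0.5), 3))
```

The first function needs predictions that have actually been made; the second needs only the quantities available when the formula is derived.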

Why compute this value if it does not 
refer to any predictions which have been 
made? In this, as in much statistical 
work, the assumption is that successive 
groups can be treated as random samples 
from a hypothetical "population" or "uni- 
verse" which possesses constant character- 
istics. This condition may not always be 
met, especially in educational data, where 
test editions vary widely, and personal 
elements enter. But, working on this assumption, the value .963,
obtained from the 1932 data, is the basis for the only available
estimate of the "true" or "population" value, and may be called the
"expected" value.

After 1933 grades were predicted the standard error of estimate was
computed, as suggested above, from the deviations of actual grades
from predicted grades. It is this value which alone deserves the
name "standard error of estimate", as it is the only one which
gives an expression of the accuracy of actual predictions. The
value obtained was .957. As an isolated statistic this value is of
practical significance, indicating the size of the errors which
have been made in the predictions.

Statistically, it is of further inter- 
est when compared with the standard devia- 
tion of the 1933 grades, 1.331. The ob- 
tained value, .957, is 71.9 percent of it, 
so that the predictive efficiency, 100.0 
minus 71.9, was 28.1 percent. In other 
words, the use of the formula was 28.1 per- 
cent "better than a guess." "Guessing" 
merely means predicting everybody at the 
average, which is thereby not guessing, but 
the best estimate that could be made, know- 
ing only the 1932 grades. In the latter 
case, all would have to be predicted at 
2.14, the 1932 mean, and 1.331 would be the 
standard error of estimate. The fact that
the 1933 mean was 2.16, or .02 more, would
increase the error by less than .001,
though a marked change in the mean would
cause a more noticeable disturbance.

But another comparison could well be made. When the 1933 data were
used to compute a formula, which was thereby the one which would
best predict the 1933 grades, the standard error of estimate was
found to be .921. This smaller figure was due in part to the
smaller standard deviation of 1933 grades, but also to a closer
relation between predictive measures and grades. In order to obtain
a smaller value than .921, measures bearing closer relationship to
grades would have been required. The value .921 indicates a
predictive efficiency of 30.8 percent. With the relations as they
were, this value may be considered as the limit of efficiency of
prediction.
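The predictive efficiency quoted here is simply 100 minus the standard error of estimate expressed as a percent of the standard deviation of the grades. A minimal Python sketch, with the function name `predictive_efficiency` introduced here for illustration, retraces the two figures given in the text.

```python
def predictive_efficiency(standard_error, sd_of_grades):
    """Percent by which the formula improves on predicting everyone at the mean."""
    return 100 * (1 - standard_error / sd_of_grades)


# Figures from the text: the 1933 grades had a standard deviation of 1.331.
print(round(predictive_efficiency(0.957, 1.331), 1))  # observed with the 1932 formula
print(round(predictive_efficiency(0.921, 1.331), 1))  # minimum, from the 1933 data
```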

1. Correction for the number of degrees of freedom used in fitting, which should be used in more exact work, has been
omitted here. The correction would increase the "expected" values by less than .5 percent.

Could not the predictions have been improved by some modification
of the formula which could have been made before the grades being
predicted were known? To predict 1933 grades with 30.8 percent
efficiency, 1933 data would have to be used throughout the
computation of the formula. Though complete 1933 data were not
available until the grades were known, the means and standard
deviations of the predictive measures were known, and might be used
in place of the corresponding 1932 data in deriving the formula. In
Table I, the regression coefficients are given in the "standard"
form. To reduce them to the "score" form, the formulas are:

$$b_{12.3} = \frac{\sigma_1}{\sigma_2}\,\beta_{12.3}$$

$$b_{13.2} = \frac{\sigma_1}{\sigma_3}\,\beta_{13.2}$$

$$c = m_1 - b_{12.3}\,m_2 - b_{13.2}\,m_3$$

The modification consists merely of using 1933 values of σ₂, σ₃,
m₂, and m₃ instead of those for 1932. For example, β₁₂.₃ was .4646,
and the regular form for b₁₂.₃ was

$$\frac{1.3567}{1.4131} \times .4646, \text{ or } .446.$$

To obtain the modified form, 1.4131 was replaced by 1.9823, giving
for b₁₂.₃ the value .318.
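The conversion from standard form to score form, and the proposed modification, amount to a change of one divisor. The Python sketch below is an illustration under that reading; the function names `score_coefficient` and `intercept` are introduced here, and only the figures quoted in the text (.4646, 1.3567, 1.4131, 1.9823) are used.

```python
def score_coefficient(beta, sd_criterion, sd_predictor):
    """Score-form coefficient: b = (sigma_1 / sigma_k) * beta."""
    return sd_criterion / sd_predictor * beta


def intercept(m1, b12_3, m2, b13_2, m3):
    """Constant term: c = m1 - b12.3 * m2 - b13.2 * m3."""
    return m1 - b12_3 * m2 - b13_2 * m3


beta12_3 = 0.4646   # standard-form coefficient from the 1932 data
sd_grades = 1.3567  # standard deviation of 1932 grades

regular = score_coefficient(beta12_3, sd_grades, 1.4131)   # 1932 placement-mark sd
modified = score_coefficient(beta12_3, sd_grades, 1.9823)  # 1933 placement-mark sd
print(round(regular, 3), round(modified, 3))
```

The same substitution applies to the second coefficient and to the means in the constant term, so that nothing from the grades of the new group is required.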

This modification might be expected to correct errors which would
enter in case, for example, the new edition of the placement
examination were easier, with resulting higher marks. An unmodified
formula would then predict grades too high, whereas the
modification would at least give the assurance that the mean of the
predicted grades would be the same as the 1932 mean. The assumption
is that average grades are more stable than the averages of the
predictive measures. Errors to be corrected, such as that arising
from a change in the difficulty of a test, are of a constant
nature. In small groups where the sampling variability begins to
assume the same proportions as the constant error, the modification
could hardly be expected to be of value. Even in small groups,
however, it may be found by experience that teachers do not vary
the average grades in full proportion to the variations in ability
which do occur. Wherever this is true, undesirable as the situation
may be, the condition required for the use of the proposed
modification is fulfilled.

The modified form of the 1932 formula was

$$X_1 = .318X_2 + .210X_3 - 1.12.$$

A table, similar to the one used previously, was made, and the 1933
grades were again predicted. In this case the standard error of
estimate was .947, a figure slightly less than the .957 obtained
with the first formula, and yielding 28.9 percent efficiency,
compared with the 28.1 percent obtained previously. It would appear
that this modification was worth while.

To test the innovation further, the experiment was continued to the
1934 grades, which were predicted from the regular and modified
formulas derived from both the 1932 and 1933 sets of data, using
tables giving the predictions to the nearest .4 as before. The
standard errors of estimate, which are summarized in Table II, were
as follows:

1932 formula, regular form ..... .936
              modified form .... .916

1933 formula, regular form ..... .908
              modified form .... .913.


Again the modification of the 1932 formula showed an advantage over
the regular form. The slight disadvantage shown in the case of the
modified 1933 formula was given further study. An examination of
the table showed that the rounding to the nearest .4 may have been
especially unfavorable in this case. To investigate the point, and
at the same time to show how much error was incurred by the
rounding, both 1933 tables were recomputed to the nearest .1, and
the predictions repeated. The standard errors of estimate were
found to be:

1933 formula, regular form ..... .913
              modified form .... .907.

The difference is still small, but is now in favor of the modified
formula. The indication is, moreover, that the error introduced by
the rounding is a relatively small one in comparison to the
advantage shown, for example, by the modified 1932 formula over the
regular one.


                              TABLE II

          STANDARD DEVIATIONS OF GRADES COMPARED WITH
       STANDARD ERRORS OF ESTIMATE FOR ALL PREDICTIONS:
            EXPECTED, OBSERVED, AND MINIMUM VALUES

                              Standard        Standard Errors of Estimate
                              Deviation   ------------------------------------------
                              of Grades                   Observed
                              Being                  Regular   Modified
Prediction                    Predicted   Expected   Formula   Formula    Minimum
---------------------------------------------------------------------------------
1933 grades from 1932
  formula ..................    1.331       .963       .957      .947       .921
1934 grades from 1932
  formula ..................    1.315       .963       .936      .916       .894
1934 grades from 1933
  formula ..................    1.315       .921       .908      .913       .894
1934 grades from 1933
  formula (nearest .1) .....    1.315       .921       .913      .907       .894
1934 grades from combined
  1932 + 1933 formula ......    1.315       .954       .923      .898       .894


It will be noted that all of the formulas yielded a smaller
standard error of estimate than that obtained in the course of
computing them, i.e., the "expected" values .963 for 1932 and .921
for 1933. But the complete 1934 data yield a standard error of
estimate of only .894, a figure again smaller than for previous
years, but not surprising, because of the size of the standard
deviation of the grades, which was 1.315, also the smallest for
three years. The minimum figure, .894, represents a maximum
predictive efficiency of 32.0 percent. The regular 1932 formula
yielded 28.8 percent efficiency, while the modification increased
that to 30.3 percent. The various forms of the 1933 formula gave
30.6 to 31.1 percent efficiency, but there was no way of knowing
before the predictions were made that it would be superior to the
1932 formula.

A further attempt was made to secure a more efficient prediction of
the 1934 grades. Data from 1932 and 1933 were pooled, and a single
formula derived, in both the regular and modified forms. Tables
were computed as before, and the 1934 grades were again predicted,
with the following standard errors of estimate:

Combined formula, regular form ... .923
                  modified form .. .898.

Thus the regular form was 29.8 percent efficient, and the modified
form, 31.7 percent, compared with the maximum of 32.0.
Unfortunately, the investigation does not reach far enough to give
assurance that the combination will prove superior in most cases,
but such is very likely to be the case. It certainly cannot be
expected that a formula derived from data of a single group will
continue, year after year, to yield predictions near maximum
efficiency, without some modification.

An examination of the data for the three years, as given in
Table I, does not warrant the abandonment of the assumption of
random sampling. Except for σ₂, none of the means or standard
deviations vary more than would be expected by reason of chance, or
"sampling", variation. The variations in σ₂ alone are too large to
be explained in this way. Since this variation is in one of the
predictive measures, its effect can be corrected, in part at least,
by the modification which has been used.


CONCLUSIONS


1) Very good predictions were obtained in spite of reducing the
predictive measures to one digit.

2) There was some evidence to show that rounding the predicted
grades so that there would be only 15 or fewer different values did
not cause serious disturbance to the value of the predictions.

3) Variations in most of the data from year to year were not large
enough, for samples of this size, to contradict the conditions of
random sampling.

4) The modification of prediction formulas, by substituting the
means and standard deviations of the predictive measures for the
new group being predicted, was found to give worth-while
improvement in predictive efficiency.

5) Judging from the single case tested, an apparent advantage in
favor of pooling data from successive groups merits at least some
consideration.


PREDICTING THE RETURNS FROM QUESTIONNAIRES:
A STUDY IN THE UTILIZATION OF QUALITATIVE DATA

by

Herbert A. Toops
Ohio State University


Through the kindness of Dr. William G. Carr of the National
Education Association the writer was supplied with the data
regarding the "traits" (tests) and the percentage return
(criterion) on some 135 different questionnaire investigations
which were collected by the N.E.A. in their effort to formulate
principles of questionnaire construction and promulgation. In the
published report¹ certain principles of questionnaire construction,
in order to secure a maximal return from the questionnaires
distributed, were elucidated. These arguments were all based upon
the observed zero-order correlation coefficients, or rather
scatter-diagrams, between the variables in question (X) and the
percentage return (Y). Accordingly, it readily occurs to one that
perhaps if multiple regression equations were employed some of the
conclusions of that report might be altered considerably. But since
many of the "variables" regarding the questionnaires were
qualitative variables (e.g., form of reproduction of questionnaire:
"mimeographed", "printed", "typewritten"), such variables could not
be employed in the customary multiple regression procedures. The
author, being much interested in determining what were "the most
important considerations in questionnaire formulation in order to
get a maximal return" for the purposes of a forthcoming reference
book on questionnaire construction,² appealed to Dr. Carr for a
copy of the original data, as above stated, in the hope that a
method of handling these qualitative variables could be worked out.

The nature of the data may be inferred
from Table I, in which the "answers" for
questionnaires Nos. 012 and 014 are given, 
together with the corresponding originally 
coded scores which were punched into Hol- 
lerith cards for the production of the 
"validity correlation plots." In this cod- 
ing process each different answer was given 
its own code number; certain rough classi- 
fications, or judgments similarly cate- 
gorized, having already been ascribed to 
certain ones of the original data by the 
N.E.A. as for example in variables l, 5, 
10, 11, 12, and 13. Three of the variables 
were derived by the writer, namely variables 
4, 6, 16, from the data supplied by the 
N.E.A. 

From the resulting plotted validity 
scatter-diagrams it was obvious that, tak- 
ing the scores as originally coded, many of 
the validities were practically zero or 
even negative. There were three obvious 
difficulties:- (1) An alphabetical order 
of the categories of a qualitative variable 
is not necessarily the correct sequential 
order of the categories on a "quantitative 
scale"; (2) some of the quantitative vari- 
ables had evident tendencies toward curvi- 
linearity of regression; (3) the com- 
pounded-answer variables yielded a "valid- 
ity plot" of meaningless complexity. 

Both of the first two difficulties may 
be readily overcome by so transmuting the 
scores as to rectify the regression of Y on 
X. We can rectify the regression line of a 
criterion variable upon a test variable by 
the following simple principle:- To each 
category (or quantitative score) of a 
variable ascribe a transmuted score which is 











1. N.E.A. Research Division--The Questionnaire. Research Bulletin of the National Education Association, Vol. 8, No. 1, Jan. 1930, 51 pp. 

2. Toops, Herbert A. Questionnaires, Standard Codes and Hollerith Machines. To be published shortly. 
TABLE I

[...] FOR TWO QUESTIONNAIRES, NOS. 012 AND 014

(Columns: variable number and name; "answers" for questionnaires Nos. 012 and 014; corresponding originally coded scores for correlation-plotting, Nos. 012 and 014. Entries that cannot be read in this copy are shown as [?].)

No. | Variable | Answer, No. 012 | Answer, No. 014 | Code, 012 | Code, 014
1 | Source of questionnaire | Director of Industrial Arts | Director of Research of the N.E.A. | [?] | [?]
2 | State of origin (address of questionnaire sender) | Connecticut | Washington, D. C. | 04 | 06
3 | Sex of author | Male | Male | [?] | [?]
4 | Is author's name included in "American Leaders in Education"? | No | Yes | [?] | [?]
5 | Subject classification of the questionnaire | Vocational Education | Teachers' Salaries and Professional Status | 07 | 01
6 | Length of time in tenths of a year since the beginning of the school year, when the questionnaire was issued | [?] | [?] | 3 | 5
7 | Form of reproduction of the questionnaire | [?] | Printed | 1 | 2
8 | Number of pages in the questionnaire | [?] | 4 pages | 1 | 4
9 | Number of items requested | [?] | 33 | 008 | 033
10 | Types of question asked (including combination types) | Short answer | Yes and No; check, and short answer | [?] | [?]
11 | Was space for answers ample? | [?] | [?] | [?] | [?]
12 | Types of material requested (including combination types) | Statistics; Other Object. Data, Opinion | [?] | 21 | 04
13 | Availability of data requested | [?] | [?] | [?] | [?]
14 | Number of copies of questionnaires issued | [?] | [?] | 0055 | 26680
15 | Was a tabulation of results furnished the N.E.A.? | [?] | [?] | [?] | [?]
16 | Average number of items per page of the questionnaire (Variable 9 ÷ Variable 8) | 8 | 8 | 06 | 06
17 | Number of "usable" replies received | [?] | [?] | [?] | 1552
0 | Percentage of replies received (The Criterion): [(Variable 17) x 100] ÷ Variable 14 | 58.2 | [?] | [?] | [?]
proportional in size to the average dependent (and associated)
criterion score of the category (or quantitative score) in question. It
will be noted that this principle maximalizes the zero-order validity
coefficients; and, since some of the categories have few frequencies,
the reliability of this transmutation is low; and, finally, the
validities noted are, correspondingly, undoubtedly too optimistic.

Accordingly, in all qualitative variables, and in all quantitative ones
where the regressions were markedly curvilinear, the means of the
columns of the scatter-diagrams were computed.
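As a rough illustration of the principle just stated, the column means of a scatter-diagram can be computed as below. This is only a sketch: the function name `column_means` and the six questionnaires in the usage lines are hypothetical and are not data from the study. The count returned with each mean is a reminder that categories with few frequencies yield unreliable transmuted scores.

```python
from collections import defaultdict


def column_means(categories, returns):
    """Return {category: (mean criterion score, number of cases)}.

    `categories` holds the category of each questionnaire on one variable;
    `returns` holds the matching percentage returns (the criterion).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, pct in zip(categories, returns):
        sums[category] += pct
        counts[category] += 1
    return {c: (sums[c] / counts[c], counts[c]) for c in sums}


# Hypothetical data, for illustration only.
forms = ["Printed", "Printed", "Typed", "Mimeographed", "Typed", "Mimeographed"]
pcts = [50.0, 60.0, 70.0, 65.0, 80.0, 75.0]
means = column_means(forms, pcts)
for name, (mean, n) in sorted(means.items(), key=lambda kv: kv[1][0]):
    print(f"{name:13s} mean return {mean:5.1f} from {n} cases")
```

Sorting the categories by their means, as the print loop does, gives the sequence in which the categories would be placed on the quantitative scale.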

















The Differences Technique. (For simple-categoried variables and
curvilinear regressions.)

On the qualitative variables, the results were typified by the
following:-
Variable 7.

Form of Reproduction      Mean Criterion Score
of Questionnaire          (Percents of Returns)
Mimeographed              68.6
Printed                   57.4
Typed                     74.1

It is clear that in this variable the categories do not appear in the
correct sequence of the quantitative variable to which they may be
reduced. By rearrangement we have:-

Form of Reproduction      Average Percents
of Questionnaire          of Returns
Printed                   57.4
Mimeographed              68.6
Typed                     74.1

Now it is clear that, by the above principle, we may rectify perfectly
the above regression by ascribing to the three categories X' scores
which are in exact proportion to the three averages, namely

                          X'
Printed                   57.4
Mimeographed              68.6
Typed                     74.1
However, these numbers are clumsy to deal with, and we have the further
principle in coding that: Any score X' can be transformed by the
equation X' = a + bX" (provided b be positive) without changing either
the magnitude or the sign of the resulting intercorrelation
coefficients of the variable X' in question. In other words we can
multiply all of the above scores by any positive constant b (either
mixed, fractional or decimal) and then add any constant a (either
positive or negative) without affecting the correlation coefficients of
the variable in question in its relation to other variables. This
process can be done in such a way that, by slight shifts in the scores
assigned, small integral numbers will result, and the regression will be
almost maximalized. In the above case, for example, the work may be
arranged as follows:-

                                   Printed   Mimeographed   Typed
Average Percent of Returns (X')     57.4        68.6         74.1
Difference in Returns                      +11.2        +5.5
Difference Divided by a Common
  Denominator of the Differences
  (5.60)                                    +2           +1
Coded Scores, if "Printed" be
  Assigned a Score of 1              1           3            4

In the above, a = +51.8
              b = 5.6
and,          X' = 51.8 + 5.6X"
although, by the method (of differences) used, there is no need for the
determination of the magnitudes of a and b¹ or for the statement of the
transmutation equation.
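The arithmetic of the differences technique can be sketched in a few lines. The function name `difference_codes` is hypothetical; the three averages and the common denominator 5.60 are those of the worked case above. The sketch sorts the categories by average return, divides each successive difference by the common denominator, rounds to the nearest integer, and accumulates from a starting score of 1.

```python
def difference_codes(averages, denominator, start=1):
    """Assign small integral codes to categories by the method of differences.

    `averages` maps each category to its average percentage return;
    `denominator` is the common denominator chosen for the differences.
    """
    ordered = sorted(averages.items(), key=lambda kv: kv[1])
    first_name, previous_mean = ordered[0]
    codes = {first_name: start}
    current = start
    for name, mean in ordered[1:]:
        # Successive difference, reduced to a small whole number.
        current += round((mean - previous_mean) / denominator)
        codes[name] = current
        previous_mean = mean
    return codes


variable_7 = {"Mimeographed": 68.6, "Printed": 57.4, "Typed": 74.1}
codes = difference_codes(variable_7, 5.60)
print(codes)  # Printed 1, Mimeographed 3, Typed 4
```

The rounding step is where the "slight shifts in the scores assigned" enter: the quotient for the second difference is a little under 1 and is taken as 1.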


The Transmutation Table Technique. (An al- 
ternative method for simple-categoried 
variables and curvilinear regressions. ) 

In the case of Variable 2, the state of origin (address of questionnaire
sender), we did the following:-

1. Upon an outline map of the United States we copied the several
percentage returns individually of all the questionnaires originating in
each of the respective states. Seventeen of the states had none; ten
others had 1, while twenty-one had from 2 to 21, the last originating in
the state of New York. Canada and the District of Columbia also had one
each.

1. It will be observed that where there are only two categories to a variable (e.g., "No" and "Yes") these may always be coded 1 and 2 without further ado.

2. The returns by states were averaged; whereupon it was noted that
there are distinct regional tendencies, for example, very low returns in
the extreme west and very high in the northcentral and northeast. The
average return figure thus obtained for a state was allowed to represent
a state in all cases where there were two or more questionnaires per
state; also in all cases of one per state unless that figure was more
than 15 percent removed from the modal figure of that region, when the
modal figure of the region was substituted instead, such correction
being necessary only for two states.¹

3. The resultant percentages were coded by transmutation as shown in
Table II.


TABLE II 

Average Percentage 
of Return Per State 

52 - 53 
54 - 55 
56 - 57 
58 - 59 
60 - 61 
62 - 63 
64 - 65 
66 - 67 
68 - 69 
70 - 71 
72 - 73 
74 - 75 
76 - 77 
78 - 79 
80 - 81 
82 - 83 
84 - 85 
The final coding is shown in Table III. It will be noted that some 17
states are not represented at all in the list; for these states, scores
will need to be supplied arbitrarily, for the present, when our rating
scale proposed below is used in areas not covered by the original data
of this study.
TABLE III 

A RELATIVE INDEX OF PROPORTIONATE RETURNS TO 
QUESTIONNAIRES TO BE EXPECTED, AS BASED 
UPON THE STATE OF ORIGIN OF THE SENDER 

                      X'
Alabama               17 
California             2 
Colorado               8 
Connecticut            9 
Delaware              15 
Dist. of Columbia      1 
Georgia               17 
Illinois               5 
Indiana                1 
Iowa                   8 
Kansas                11 
Kentucky               8 
Maryland              10 
Massachusetts          9 
Michigan              14 
Minnesota              7 
Missouri               8 
Montana                5 
Nebraska               7 
New Hampshire          9 
New Jersey            15 
New York               8 
North Carolina         3 
Ohio                   9 
Oklahoma               8 
Oregon                 2 
Pennsylvania 
Tennessee 
Texas 
Virginia 
Washington 
Wisconsin 
Canada 
The above process is an alternative to the former and, like the former,
makes a determination of the magnitudes of a and b unnecessary.


Treatment of "Answer Patterns." 

In the case of variables 12 and 10, 
where the "answers" to the variables are 
compounds or patterns, each pattern was in- 
dividually coded but without yielding mean- 
ingful results when plotted as validity ta- 
bles. 

The original categories of "answer" to 
Variable 12, for example, were:- 








1. It will be noted that this is done in accord with the notion above that low-frequency-determined averages are less reliable than high-frequency-determined averages. The map procedure also gives a method of supplying scores for the seventeen states which had no questionnaire originating therein; and, as such, is applicable to locations only. 
Addend    Category 

   1      Statistics 
   2      Lists of names (schools, etc.) 
   4      Other objective data 
   8      Discussion of subjective opinion 
  16      Opinion 


If now one will add the addends for any pattern of "response" one will
arrive at a unique code number for that pattern. For the answer-pattern
"statistics, other objective data, and opinion", for example, the code
number is 21. The resulting validity table [of 31 code numbers (X)
against percent return (Y)] was, as might have been expected,
meaningless. Clearly we have here a case of plotting in sixfold space
the possible combinations of 5 variables of two categories
each--possession and non-possession--and noting the resulting surface
where Variable 0 is the dependent variable! Various expedients were
tried. Finally, it became clear that the most important distinction was
between "objective" and "subjective" replies and that the "complexity"
of reply demanded had something to do with the matter. These
distinctions could be made by dividing the above categories into
objective and subjective types as follows:-

Objective types (3 types) 
    No objective data 
    Statistics 
    List of names (schools, etc.) 
    Other objective types 
Subjective types (2 types) 
    No subjective data 
    Discussion of subjective opinion 
    Opinion 
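The addend scheme and the subsequent split into objective and subjective types can be sketched as follows. The sketch takes the third addend as 4, which is the value that makes every sum unique and yields 21 for the pattern cited; the function names are hypothetical. Because the addends are successive powers of two, any code number from 1 to 31 decomposes into exactly one pattern.

```python
# Addend -> (category, type), following the list of categories for Variable 12.
ADDENDS = {
    1: ("Statistics", "objective"),
    2: ("Lists of names (schools, etc.)", "objective"),
    4: ("Other objective data", "objective"),
    8: ("Discussion of subjective opinion", "subjective"),
    16: ("Opinion", "subjective"),
}


def encode(categories):
    """Sum the addends of the categories present in an answer pattern."""
    wanted = set(categories)
    return sum(a for a, (name, _) in ADDENDS.items() if name in wanted)


def decode(code):
    """Recover the categories of a pattern from its code number."""
    return [name for a, (name, _) in ADDENDS.items() if code & a]


def type_counts(code):
    """Return (number of objective types, number of subjective types)."""
    kinds = [kind for a, (_, kind) in ADDENDS.items() if code & a]
    return kinds.count("objective"), kinds.count("subjective")


pattern = ["Statistics", "Other objective data", "Opinion"]
code = encode(pattern)
print(code, decode(code), type_counts(code))
```

The pair returned by `type_counts` is the row and column at which a questionnaire would be entered in a two-way table of objective against subjective types.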
Accordingly, a questionnaire might have anywhere from 0 to 3 objective
types of data requested, and, concurrently, anywhere from 0 to 2
subjective types. Table IV was constructed. If one will erect pins at
the centre of the compartments of Table IV, proportional in height to
the table entries therein, he will note that a fairly smooth doubly
warped surface results, if we readjust but one point, compartment 3-1
(Row 3; column 1). The resulting code numbers (figures in the triangles
of Table IV) were determined then from the following transmutation
table (a 4 being supplied arbitrarily for the percentage 22.0):-

46.0 - 47.9     1 
48.0 - 49.9     2 
50.0 - 51.9     3 
52.0 - 53.9     4 
 . . . 
74.0 - 75.9    15 


TABLE IV 
PERCENTAGE OF RETURNS ON VARIABLE 12 

(The number in parentheses in the upper left-hand 
corner of a compartment is the number of questionnaires 
involved; the figure in the body is the average 
percentage return; and the figure in the triangle is the 
code number finally decided upon. Entries that cannot be 
read in this copy are shown as [?]; each compartment is 
given as number, percentage, code.) 

Number of           Number of Subjective Types 
Objective Types     0                  1                  2 
0                   ---                (8) 55.5, [?]      (5) 65.0, 10 
1                   ([?]) 74.7, [?]    ([?]) 70.5, [?]    ([?]) 66.5, 12 
2                   ([?]) 67.5, [?]    (10) 64.5, [?]     (6) [?], [?] 
3                   (4) 58.2, [?]      (2) 22.0, 4        (2) 46.5, [?] 
The result, it will be seen from the table, 
is a series of very satisfactory code num- 
bers most of which are based on enough 
cases perhaps to be fairly reliable. In 
such tables the reliability of the coding 
inheres somewhat in the consistency of the 
network of values assigned. 
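The rows of the transmutation table that are printed (46.0 - 47.9 coded 1, through 74.0 - 75.9 coded 15) are consistent with two-point intervals beginning at 46.0. On the assumption that the omitted rows follow the same pattern, the table reduces to a one-line rule; the function name `transmute` is hypothetical. Percentages outside the table, such as the 22.0 for which a code was supplied arbitrarily, are rejected rather than guessed.

```python
def transmute(percent, low=46.0, width=2.0, max_code=15):
    """Return the code number for an average percentage return.

    Assumes equal intervals of `width` points starting at `low`,
    which matches the rows of the transmutation table that are shown.
    """
    if percent < low or percent >= low + width * max_code:
        raise ValueError(f"{percent} lies outside the transmutation table")
    return int((percent - low) // width) + 1


for pct in (46.5, 53.9, 65.0, 74.7):
    print(pct, transmute(pct))
```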

In Variable 10, after trying successively various combinations, the
eventual distinction seemed to turn first upon simply recorded answers
(yes-and-no; and checking) as versus more complicated written answers
(either short or long written answers, requiring more deliberation in
determining what to write down as the answer); and second, upon the
variety of answering procedures demanded, complicated answers receiving
the better returns.¹ For Variable 10, Table V was evolved.


TABLE V 
PERCENTAGE OF RETURNS ON VARIABLE 10 

(The number in parentheses in the upper left-hand 
corner of a compartment is the number of questionnaires 
involved; the figure in the body is the average 
percentage return; and the figure in the triangle is the 
code number finally decided upon. Entries that cannot be 
read in this copy are shown as [?]; the code numbers in 
the triangles are not legible.) 

                          Complicated written answer 
Simple Answer             No complicated     Complicated answers (short 
Required                  answers            and/or long written answers) 
No simple answers         ---                (38) 67 
Yes-and-No                (2) [?]            (64) [?] 
Check                     (4) 63             (6) 71 
Yes-and-No and Check      (5) 64             (21) 58 

All the variables having thus been 
coded and quantified, all intercorrela- 
tions, means and standard deviations were 
computed (Table VI, page 210). To these 
the multiple ratio technique was applied 
in order to determine, successively, the 
identity of the minimal 2, 3, 4 ... vari- 
able-composites which, optimally weighted, 
will maximally predict the return to be ex- 
pected of a questionnaire of a given pat- 
tern of (our test-measured) characteristics. 
See Table VII, page 211. 

The variable of highest zero-order validity is Variable 5, the
subject-classification of the questionnaire--whether the questionnaire
deals with a topic which "in general is well replied to" by the persons
questionnaired; and a measure of the recipients' "interest" in the
investigation--and this variable in its own right correlated .4328 with
the returns received. Employing one "test" only to predict the
percentage return, we would use this one.


If now we give standard scores in this variable a weight of 1.0000,
that one of all the remaining sixteen variables which will raise the
prediction most is Variable 14, the number of copies of the
questionnaire issued--a variable of almost equal importance, as
revealed by its negative weight of -.9796--and it results in a multiple
coefficient of .5609 for the two tests in combination, so weighted. As
stated in the remarks, this variable probably takes second place in the
regression equation for the reason that it is, perhaps, the best
indirect (and inverse) measure we have of the amount of follow-up effort
expended on the questionnaires. Observation would indicate that very
large mailing lists are frequently if not commonly employed for the
purpose of getting a "large enough reply" without resorting to
follow-up procedures. In a recent business questionnaire, for example,
over 200,000 questionnaires were distributed in order to obtain some 18
percent, or about 40,000 replies. Accordingly, this variable received a
negative weight and boosts the multiple ratio coefficient markedly. No
subsequent variable can add nearly so much. If we were to predict the
returns by two variables only we should employ these two, Variables 5
and 14, for the purpose. The regression equation,

\[ \frac{X_0 - M_0}{\sigma_0} = (1.00)\,\frac{X_5 - M_5}{\sigma_5} - .9796\,\frac{X_{14} - M_{14}}{\sigma_{14}} \]

has a validity of .5609 in predicting the expected returns, X_0, to
questionnaires for which these two items are known in advance.
1. It will be remembered that we are not here by this diagram answering either the question "Are short or long questions desirable in a questionnaire?" or "Are simple or complicated forms of answer response desirable in a questionnaire?" With the variable quantified, we may hope that the regression analysis will throw some light on these problems.

Variable 2 comes next, the state of origin of the questionnaire, which,
with a weight of .8861, yields in combination with the previous two--a
three-variable scale--a multiple ratio coefficient of .6245. [After the
second test, the magnitude of the β-weight no longer is an index of the
relative worth of the variables, but worth must be inferred only from
the increase in size of the multiple ratio coefficient. It will be
remembered that all other variables yield lower coefficients when placed
in the third place in our scale.] Accordingly, employing only Variables
5, 14, and 2 we can predict, by means of the proper equation, the
returns on a questionnaire in education with considerable efficiency, as
indicated by the coefficient .6245. This last-added variable is highly
important no doubt for the reason that the attitude of recipients
towards answering questionnaires is a product of geographical location,
as well as, no doubt, of excellence of technique (including follow-up)
employed in particular localities. The appropriate regression equation
to use is

\[ \frac{X_0 - M_0}{\sigma_0} = (1.00)\,\frac{X_5 - M_5}{\sigma_5} - .9796\,\frac{X_{14} - M_{14}}{\sigma_{14}} + .8861\,\frac{X_2 - M_2}{\sigma_2} \]
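To make the form of the three-variable equation concrete, the sketch below evaluates its right-hand side: each coded score is reduced to a standard score and multiplied by the weight quoted in the text (1.00, -.9796 and .8861 for Variables 5, 14 and 2). The means and standard deviations used here are hypothetical placeholders, since the actual values belong to Table VI and are not reproduced in this section; the function names are likewise illustrative.

```python
# Weights quoted in the text for Variables 5, 14 and 2.
WEIGHTS = {5: 1.00, 14: -0.9796, 2: 0.8861}


def standard_score(x, mean, sd):
    """Deviation of a score from its mean, in standard-deviation units."""
    return (x - mean) / sd


def weighted_composite(scores, stats, weights=WEIGHTS):
    """Sum of weight * standard score over the variables in `weights`.

    `scores` maps variable number -> coded score X;
    `stats` maps variable number -> (mean M, standard deviation sigma).
    """
    return sum(
        w * standard_score(scores[v], *stats[v]) for v, w in weights.items()
    )


# Hypothetical means and standard deviations, for illustration only.
stats = {5: (8.0, 2.0), 14: (1000.0, 500.0), 2: (9.0, 4.0)}
scores = {5: 10.0, 14: 500.0, 2: 13.0}
composite = weighted_composite(scores, stats)
print(round(composite, 4))
```

With these placeholder figures the questionnaire stands one standard deviation above the mean on Variables 5 and 2 and one below on Variable 14, so all three terms contribute positively to the composite.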


Variable 13, availability of data, comes next. As indicated by the
footnote of Table VII, the negative weight here is a product of the
method of coding. Clearly the result indicates that the questionnaire
returns are dependent upon "the path of least effort" so far as the
recipients are concerned. One can hardly hope to get unpaid
"researchers" to do research for one.
The next most important variable is Variable 10, the types of question
asked. The β-weight is positive. This means that any combination with
high X'-value receives better returns than any combination with low
X'-values. Contrary to expectations, the more complicated forms of
response when employed in a questionnaire receive the higher return,
other things (specifically the four previous things) being equal.
"Reasonableness" having just previously been accounted for (Variable
13), the positive weight possibly means that it is the "thinking" type
of question which is "of interest" and "of value" to the recipient--or
at least of such minimal value as to lead him to reply rather than to
refrain from replying. It will also be noticed (Table V) that checking
alone (and also as a score in the regression equation) yields a better
return than the yes-no type of question. It is the writer's observation
that many persons attempt to force the replies of "yes" and "no" to
issues to which one may not, honestly, answer a categorical "yes" or
"no."

The sixth variable (No. 12) is Types of Material Requested. This is in
part a plea for objectivity, on the part of recipients, as versus
subjectivity and opinion; but, even more so, for simplicity of mode of
answer response. If one must change frequently his mode of response,
particularly if he has many alternations from
fact-to-opinion-to-fact-to-opinion, we might guess that the recipient of
a questionnaire will become disgusted and throw the whole questionnaire
in the waste basket. Such variables as Nos. 10 and 12, where patterns of
response are the issue, are of much greater difficulty to interpret than
the simple responses of a simple-categoried variable such as Variable 7,
the form of reproduction [where it is clear, as shown in the footnote
below, that typed questionnaires get the greater response, the other
things (of this investigation) being equal, because--and this is
speculation--the typed questionnaires are sent to one's more intimate,
more-professionally-inclined friends who have a greater personal
obligation to reply.] One can give statistics, names of people,
schools, etc., quickly and without thinking--if available at one's
fingers' ends. It takes time to make up one's mind on issues; some
composing ability "to write it well", and some mental inertia to be
overcome to get the thinking process started. Besides, one's opinions
are sometimes valued for personal exploitation--for a proposed book or
article, or for impressing a superior at the proper time (of annual
promotions, for example) with one's worth--and, accordingly, are not
always to be had for the mere asking, since all too often it is thought
that "to give away an idea impoverishes me and is of doubtful value to
you, since you do not know the conditions out of which it originated."

The next variable is the source of the 
questionnaire (No. 1) whether sent by a 
school superintendent, a college professor, 
a research bureau, etc. County departments 
of education get the best returns [we sus- 
pect some "moral" obligation to reply] 
while business firms (including publishers), 
associations and foundations of national 
scope and universities, colleges and normal 
schools, receive the least. It is easy to 
see that compulsion, moral duty to reply, 
intimate knowledge of and friendship with 
the sender are elements in the situation 
here. 

In view of the fact that these seven variables yield a multiple ratio
coefficient of .7137 as the efficiency with which the percentage returns
can be predicted by means of these seven most important variables, and
in view of the fact that there is every indication that the remaining
ten variables in all would not raise this figure to .73, we may consider
this selection as adequate for the purpose of building a scale to
predict questionnaire returns.¹

If, in summary, we would seek to abbreviate to a few phrases the
seemingly important elements in securing high returns--going beyond our
data somewhat in our attempt--we would be inclined to guess that the
following, in approximately a descending order of importance, are the
elements in getting a high return:

1. Write the questionnaire on a topic in which the recipients are
vitally interested themselves in knowing the answer, and take pains to
exploit this interest to the utmost. (Variable 5.)

1. The next five variables in order to enter the composite, together with their multiple ratio coefficients and β-weights are:-

                                                              Multiple ratio
Variable                                                      Coefficient     β-weight
6 Length of time in tenths of a year since the beginning
  of the school year that the questionnaire was issued.*         .7188          .5560
9 Number of items requested                                      .7226         -.2708
4 Is author's name included in American Leaders in Education     .7251          .2331
7 Form of reproduction of questionnaire (printed = 1;
  mimeographed = 3; typed = 4)                                   .7257          .1184
8 Number of pages in the questionnaire                           .7261         -.0948

*The positive weight of the coded scores, so taken as to rectify the regression and give large coded scores to the low tenths, is the equivalent of a negative weight, meaning "the seven previous variables being equal, the earlier in the school year the questionnaires are sent the better." The coding table is not reproduced for lack of space.

March, 1935

2. Send the questionnaire to those few 
people who, because of personal friendship 
and knowledge of your professional repute, 
will feel some personal obligation to re- 
ply. Exploit this by promises of the re- 
sults, and other means. 

3. Employ a vigorous follow-up technique, devised to touch upon various
motives in turn, as viewed from the recipients' angle. Do not be content
to send 40,000 questionnaires and receive a reply of as many hundreds.
(Variable 14.)

4. Use the best possible technique in 
writing your questions. 

5. Circulate your questionnaire in 
those portions of the country where "reply- 
ing is more than a courtesy, and approaches 
a fixed habit." (Variable 2.) 

6. Don't tax the interest and effort of 
the recipient, but make it easy for him to 
reply. (Variable 13.) Remember that he is 
in your employ only by courtesy. 

7. Use objective unequivocal but 
"sensible" questions. (Variables 10 and 
12.) Do not avoid written answers; but be 
chary of “essay” answers. 

8. Employ advisedly such incidental 
pressures as “moral obligation to reply." 
(Variable 1.) 

9. Send your questionnaire early in the 
school year, before the pressure of other 
duties decreases its chances of receiving 
attention. (Variable 6.) 

10. Don't worry about the length of the 
questionnaire, (Variables 9 and 8) if all of 
the other rules have been followed faithful- 
ly, but remember that length may be a symp- 
tom of slovenly questionnaire technique and 
proceed accordingly. 

An examination of the questions contained in, or implied in, the
analysis of the list of seventeen variables obtained on these
questionnaires reveals the fact that perhaps some of the more important
variables were not obtained for our analysis. In the course of the study
the conviction has grown that some of the following are, perhaps, more
important than many of the "questions asked" of questionnaire
investigations in this study:

1. The number of "follow-ups" employed. 

2. Was the promise-to-reply technique 
used? 

3. Was the relationship of recipients to sender one of "moral duty to
reply", e.g., such as of any school employee of a state in relation to
the state director of research in education of the same state?

4. Was the questionnaire anonymous? 

5. Was the investigation "“confiden- 
tial"? 

6. An index of the degree of familiar- 
ity, intimacy, friendliness and profession- 
al interest of the recipients in general 
with respect to the sender. 

7. Size of page used; perhaps area of 
the page used. 

8. Crowding of questions index; per- 
haps average number of questions per page, 
measuring pages in fractions rather than 
integers only, as herein employed. 

9. Were recipients promised "pay" for 
replying; cash, mention in report, extra 
copy of questionnaire, copy of the report, 
or combinations of the above? 

10. Was humor used? 

11. Were illustrations used? 

12. An index of the average tendency to 
reply of the recipients to the question- 
naires received based upon the state in- 
dices-of-reply-tendency of the several 
states in which the recipients reside. 

13. An index (rating scale) of the ex- 
cellence of the technique employed in the 
questions used. 

At least three of the above variables, Nos. 6, 12 and 13, are indices
requiring some research to devise the index in question. With such
available it is inevitable that the multiple ratio coefficient of .73
obtained in this investigation could be raised considerably. This in
turn means that the poor questionnaire can be spotted ahead of its
distribution; the points on which it scores low [see the rating scale
below] may be remedied by appropriate techniques, in some cases; and the
proposed conditions of distribution may be altered with a view to
securing a maximal return--in other words the whole process can be as
well or better controlled than any known kind of campaign involving the
cooperation of others for its success.

1. One questionnaire of 1835 questions, of the original 136 in the N.E.A. investigation, was discarded because this technique was employed and since the questionnaire did not seem to belong to "the universe" from which the remaining 135 were sampled.

The results of this study have been compiled in the form of a rating
scale as follows. This may be used as a means of predicting the percent
of questionnaires that will be returned in a given case.
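The credit points entered in the scale appear to be the coded score of a category multiplied by the W-weight of its variable, rounded to two decimals. The sketch below checks that reading against a few entries. The weight 4.25473 for the state of origin is the one printed in the scale; the weight 8.83 for the subject classification is an inference from the printed credit points (for instance, 8.83 against a coded score of 1) and should be treated as an assumption, as should the function name.

```python
W_STATE = 4.25473   # W-weight printed for Variable 2 (state of origin)
W_SUBJECT = 8.83    # assumed: inferred from the printed credit points


def credit_points(coded_score, w_weight):
    """Credit points for one category: coded score times W-weight."""
    return round(coded_score * w_weight, 2)


# Coded score 11 on subject classification (e.g. Teachers' Salaries and
# Professional Status) and coded score 2 on state of origin (California).
print(credit_points(11, W_SUBJECT))   # subject classification
print(credit_points(2, W_STATE))      # state of origin
```

Summing the credit points of the questions employed and adding the constant K for that combination of questions would then give the predicted percentage return.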


RATING SCALE

                                                              Credit
                                                              Points

   Constant, K, depending on the combination of tests used:
     when Questions 1 and 2 only are employed                   2.70
     when Questions 1, 2 and 3 only are employed              -34.06
     when Questions 1, 2, 3 and 4 only are employed           -14.49
     when Questions 1, 2, 3, 4 and 5 only are employed        -42.32
     when Questions 1, 2, 3, 4, 5 and 6 only are employed     -82.86
     when Questions 1, 2, 3, 4, 5, 6 and 7 are employed       -98.36

1  (Variable 5) Subject classification of the questionnaire.
   β-weight = 1.0000; W-weight = 8.335007

   X'*  No.*  Classification of questionnaire                 Credit
   11   01    Teachers' Salaries and Professional Status       97.13
    8   02    Administration                                   70.64
   10   03    Health and Physical Education                    88.30
    1   04    Instruction and Organization and Spec. Method     8.83
    6   05    Reading                                          52.98
    9   06    Finance                                          79.47
    5   07    Vocational Education                             44.15
    8   08    Junior High Schools                              70.64
   12   09    Special Education                               105.96
    7   10    Buildings                                        61.81
    8   11    Textbooks and Supplies                           70.64
    9   12    Geography                                        79.47
    9   13    Student Accounting                               79.47
    8   14    Mathematics                                      70.64
    8   15    Secondary Schools                                70.64
    9   16    Music                                            79.47
    9   17    Elementary Schools                               79.47
    8   18    Kindergarten and Primary                         70.64
   10   19    Libraries                                        88.30
    7   20    Supervision                                      61.81
    8   21    Counseling                                       70.64
    9   22    Adult Education                                  79.47
   11   23    Rural Education                                  97.13
   11   24    Teacher Training                                 97.13
    9   25    Thrift Education                                 79.47
    9   26    Theory and Principles of Education               79.47
   13   27    Foreign Languages                               114.79
    9   28    Home Economics                                   79.47
   12   29    Statistics                                      105.96
    9   30    Religious Education                              79.47
    9   31    English                                          79.47
    9   32    Curriculum                                       79.47
   11   33    Extra-curricular Activities                      97.13
    9   34    Tests and Measurements                           79.47
    8   35    Commercial Education                             70.64
    9   36    Handwriting                                      79.47
   10   37    Character Education                              88.30

2  (Variable 14) Number of copies of questionnaire issued.
   β = -.9796; W = -.0599038
   Rule: Multiply the number of copies distributed by the multiplier,
   -.0399038, and record product in credit column (minus sign attached).

3  (Variable 2) State of origin of questionnaire (determined by
   address of sender). β = .8861; W = 4.25473
   Only those states are given which had one or more questionnaires
   in this investigation.

   X'   No.   State                                           Credit
   17   01    Alabama                                          72.33
    2   02    California                                        8.51
    8   03    Colorado                                         34.04
    9   04    Connecticut                                      38.29
   15   05    Delaware                                         63.82
    1   06    Dist. of Columbia                                 4.26
   17   07    Georgia                                          72.33
    5   08    Illinois                                         21.27
    1   09    Indiana                                           4.26
    8   10    Iowa                                             34.04
   11   11    Kansas                                           46.80
    8   12    Kentucky                                         34.04
   10   13    Maryland                                         42.55
    9   14    Massachusetts                                    38.29
   14   15    Michigan                                         59.57
    7   16    Minnesota                                        29.78
    8   17    Missouri                                         34.04
    5   18    Montana                                          21.27
    7   19    Nebraska                                         29.78
    9   20    New Hampshire                                    38.29
   15   21    New Jersey                                       63.82
    8   22    New York                                         34.04
    3   23    North Carolina                                   12.76
    9   24    Ohio                                             38.29
    8   25    Oklahoma                                         34.04
    2   26    Oregon                                            8.51








* The rightmost of the two starred figures throughout is the original coded score, while the leftmost is the final quantified (and rectified) score, which multiplied by the gross score or W-weight yields the credit for the answer at issue as recorded in the "Credit Points" column.
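The arithmetic of the scale can be illustrated with a minimal sketch. The function names `credit` and `predicted_return` are hypothetical; the two W-weights and the quantified scores are those printed in the scale for Variable 2 (state of origin) and Variable 1 (source). The sketch assumes, as the footnote and the "Algebraic Total" line suggest, that a credit is the quantified score times the W-weight and that the prediction is the constant K plus the sum of the credits.

```python
def credit(quantified_score, w_weight):
    """Credit points for one answer: quantified score X' times the W-weight."""
    return round(quantified_score * w_weight, 2)


def predicted_return(constant_k, credits):
    """Algebraic total of the constant K and the credits entered for each question."""
    return round(constant_k + sum(credits), 2)


W_STATE = 4.25473   # W-weight printed for Variable 2 (state of origin)
W_SOURCE = 3.22812  # W-weight printed for Variable 1 (source of the questionnaire)

alabama = credit(17, W_STATE)     # X' = 17 for Alabama
city_supt = credit(9, W_SOURCE)   # X' = 9 for a city superintendent of schools
print(alabama, city_supt)
```

Both products agree with the credits printed against those entries in the scale, which is a useful check when an entry in the credit column is hard to read.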
   X'   No.   State (continued)                               Credit
    7   27    Pennsylvania                                     29.78
   16   28    Tennessee                                        68.08
    7   29    Texas                                            29.78
    7   30    Virginia                                         29.78
    2   31    Washington                                        8.51
   14   32    Wisconsin                                        59.57
   10   33    Canada                                           42.55

4  (Variable number illegible) Availability of data requested.
   β = -.6407; W = -3.34505

   X'   No.                                                   Credit
    1   1     Easy                                             -3.34
    4   2     Possible                                        -13.38
    8   3     Hard                                            -26.76
   15   4     Unreasonable                                    -50.18

5  (Variable 10) Types of question asked. β = .6702; W = 2.58169

   X'   Comb. No.                                             Credit
    1   1     Yes-and-no                                        2.58
    6   2     Check                                            15.49
    7   3     Yes-and-no and check                             18.07
   10   4     Short or long written answers                    25.82
   15   5     Yes-and-no and short or long written answers     38.72
   14   6     Check and short or long written answers          36.14
    1   7     Yes-and-no, check, and short or long written
              answers                                           2.58

6  (Variable 12) Types of material requested. β = .5806; W = 3.49451
   1. Count the number, from 0 to 3, of different objective types of
      material requested:
        0. No objective data
        1. Statistics
        2. Lists of names, schools, etc.
        3. Other objective data.
   2. Count the number, from 0 to 2, of different subjective types:
        0. No subjective data
        1. Discussion of subjective opinion
        2. Opinion
   3. Score according to following table:
        X' | No. of objective types | No. of subjective types | Credit
        [Entries of this table are not recoverable from the source.]

7  (Variable 1) Source of the questionnaire. β = .4843; W = 3.22812

   X'   No.                                                   Credit
    9   1     A City Supt. of Schools                          29.05
    5   2     A city school employee writing in official
              capacity (including city teachers'
              organizations)                                   16.14
    2   3     A university, college, or normal school           6.46
    2   4     An association or foundation of national scope    6.46
    -   5     A non-collegiate private school                   ----
    -   6     A U.S. Government agency                          ----
    7   7     A state governmental agency                      22.60
   15   8     A county department of education                 48.42
    2   9     A business firm (including publishers)            6.46
    5   10    A private individual, no official status given   16.14

   Algebraic Total = (Predicted Percentage Return)
THE REVERSIBILITY OF PROOF






by 
W. Line 
and 
H. B. Hedman 


I. 





INTRODUCTION 


In a previous paper (2), certain of the basic theorems underlying Spearman's factor theory were presented in a manner that can be appreciated by those unacquainted with the higher mathematics. Among them was included the proof that when each of four variables can be divided into two factors, the one being common to all four variables, whilst the other is in each case specific and independent, the tetrad equation holds. We shall here take up the reverse problem, namely, as to whether, when the tetrad criterion is satisfied, then every variable would necessarily be divisible into the said two factors. The original solutions were achieved by Garnett (1) and by Spearman (3). The discussion which follows constitutes a simplification of Section I, 3 of the Appendix (The Abilities of Man (4)) and involves theorems presented by Spearman in 1913 (5) and 1922 (3). These theorems are also restated as they are employed in the present paper.











II. THE REVERSIBILITY OF PROOF


The tetrad equation may be represented as follows:

$$r_{xy} \cdot r_{wz} = r_{xw} \cdot r_{yz} = r_{xz} \cdot r_{yw} \qquad (1)$$

This is given. Our task is to show that each of the four variables x, y, w, z is necessarily divisible into two factors, one of which is common to all four variables, the other being specific. Reference to Fig. 1 will make the setting of equation 1 quite clear.
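The direct theorem, of which the present paper seeks the converse, can be checked numerically with a short sketch. The loadings below are hypothetical, and the particular pairing of correlations used for the three tetrad differences is one common convention, assumed here rather than taken from the paper. When every correlation is the product of two loadings on a single common factor, all three differences vanish, which is equation 1.

```python
def tetrad_differences(r, i, j, k, l):
    """Three tetrad differences for variables i, j, k, l of a symmetric table r."""
    return (
        r[i][j] * r[k][l] - r[i][k] * r[j][l],
        r[i][j] * r[k][l] - r[i][l] * r[j][k],
        r[i][k] * r[j][l] - r[i][l] * r[j][k],
    )


# Hypothetical loadings of x, y, w, z on one common factor g.
loadings = [0.9, 0.8, 0.7, 0.6]
n = len(loadings)

# With a single common factor and independent specifics, r_ij = f_i * f_j.
r = [[1.0 if a == b else loadings[a] * loadings[b] for b in range(n)]
     for a in range(n)]

diffs = tetrad_differences(r, 0, 1, 2, 3)
print(diffs)
```

The reverse question is whether a table in which these differences vanish must have arisen in this way.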
1. The Appendix refers to the Appendix in The Abilities of Man (4), except where otherwise indicated.
where q, p, are other variables in an equiproportional table of correlations (i.e., a table in which the tetrad criterion is satisfied throughout).

Therefore $r_{xy} = \lambda_{xz} \cdot r_{yz}$, where $\lambda_{xz}$ is a constant, whilst y takes all values except x or z. (Equation 6, Appendix, p. iii.)

Our question, then, is tantamount to asking whether, on assuming equation 6, each of the variables involved, say a, can be reduced to the form:

$$a = f_a g + d_a \qquad (7) \quad \text{(Appendix, p. iii.)}$$
where
1. $f_a$, $f_b$, etc., are constant for all particular values of a, b, etc.,
2. g is an element common to all the variables,
3. $d_a$, $d_b$, etc., are uncorrelated with g,
4. $d_a$, $d_b$, etc., are uncorrelated with each other.

These four conditions must be shown to be satisfied, if the "reversibility" theorem holds.

With regard to conditions 1 and 2, any of the variables may be written in the form given in equation 7, so as to satisfy these conditions. Indeed, we may give to $f_a$ and to g any values we please, so long as $d_a = a - f_a g$.

The third condition demands that

$$0 = r_{d_a (f_a g)} = r_{(a - f_a g)(f_a g)} = \frac{\Sigma (a - f_a g)(f_a g)}{\left[\Sigma (a - f_a g)^2 \, \Sigma (f_a g)^2\right]^{1/2}} \quad \text{(deviation form)}$$

$$= \frac{f_a \Sigma ag - f_a^2 \Sigma g^2}{f_a \left[\Sigma (a - f_a g)^2\right]^{1/2} \sqrt{\Sigma g^2}} = \frac{\Sigma ag - f_a \Sigma g^2}{\left[\Sigma (a - f_a g)^2\right]^{1/2} \sqrt{\Sigma g^2}}.$$

Now $d_a = a - f_a g$, and $\sigma_{d_a} = \sqrt{\Sigma d_a^2 / N}$. Hence $\Sigma (a - f_a g)^2 = \Sigma d_a^2 = N \sigma_{d_a}^2$.

The third condition consequently demands that

$$0 = \frac{\Sigma ag - f_a \Sigma g^2}{\sqrt{\Sigma g^2} \, \sqrt{N} \, \sigma_{d_a}} = \frac{N r_{ag} \sigma_a \sigma_g - f_a N \sigma_g^2}{\sqrt{N} \sigma_g \, \sqrt{N} \sigma_{d_a}} = \frac{r_{ag} \sigma_a - f_a \sigma_g}{\sigma_{d_a}}.$$

But, by choosing the units so that $\sigma_a = \sigma_b = \dots = \sigma_g = 1$, the third condition demands that $0 = r_{ag} - f_a$, or that $f_a = r_{ag}$ (and similarly $f_b = r_{bg}$, etc.).

The first three conditions can accordingly be satisfied for any set of variables whatever.

The fourth condition, namely, that $d_a$, $d_b$, etc., are uncorrelated with each other, demands that $0 = r_{d_a d_b}$. Or, since $a = f_a g + d_a$, $b = f_b g + d_b$,

$$0 = r_{(a - f_a g)(b - f_b g)} = \frac{\Sigma (a - f_a g)(b - f_b g)}{\left[\Sigma (a - f_a g)^2 \, \Sigma (b - f_b g)^2\right]^{1/2}} \quad \text{(deviation form)}$$

$$= \frac{\Sigma ab - f_a \Sigma bg - f_b \Sigma ag + f_a f_b \Sigma g^2}{\left[(\Sigma d_a^2)(\Sigma d_b^2)\right]^{1/2}}.$$

Now, if we choose our units so that $\sigma_a = \sigma_b = \dots = \sigma_g = 1$, we may write this fourth condition thus:

$$0 = \frac{r_{ab} - f_a r_{bg} - f_b r_{ag} + f_a f_b}{\sigma_{d_a} \sigma_{d_b}}.$$

Or, substituting the values for $r_{bg}$, $r_{ag}$ found necessary in condition 3,

$$0 = r_{ab} - f_a f_b - f_b f_a + f_a f_b = r_{ab} - f_a f_b,$$

$$r_{ab} = r_{ag} \cdot r_{bg} \qquad (8) \quad \text{(Appendix, p. iv.)}$$
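A small constructed example may make the four conditions concrete. It is only a sketch: the three score vectors and the two loadings are hypothetical, chosen so that g and the two specific parts are exactly uncorrelated and every variable has unit standard deviation. Under those assumptions the third condition gives f_a = r_ag, and the fourth gives equation 8.

```python
import math


def pearson(x, y):
    """Product-moment correlation of two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for four cases: a common element g and two specific
# parts, all with mean zero, unit standard deviation, and zero intercorrelation.
g = [1.0, 1.0, -1.0, -1.0]
e_a = [1.0, -1.0, 1.0, -1.0]
e_b = [1.0, -1.0, -1.0, 1.0]

f_a, f_b = 0.8, 0.5  # hypothetical loadings

# a = f_a * g + d_a, with d_a scaled so that a has unit standard deviation.
a = [f_a * gi + math.sqrt(1 - f_a ** 2) * e for gi, e in zip(g, e_a)]
b = [f_b * gi + math.sqrt(1 - f_b ** 2) * e for gi, e in zip(g, e_b)]

r_ag, r_bg, r_ab = pearson(a, g), pearson(b, g), pearson(a, b)
print(r_ag, r_bg, r_ab, r_ag * r_bg)
```

The example runs in the easy direction (from the factors to the correlations); the paper's task is the converse.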


Our task now becomes that of proving 
this equation to be true, in which case the 
fourth condition will be satisfied. 

In doing this, Spearman makes use of 
the equation for the correlation between 
sums. We shall accordingly interrupt our 
analysis of the "reversibility of proof" at 
this point, in order to show how the formula 
for determining the correlation between sums 
is derived. This takes us to a considera- 
tion of Spearman's contribution in 1913. (5) 


Correlation between Sums. Using the symbols employed by Spearman, let our task be to derive an equation expressing the correlation between the sum of a variables, and the sum of b variables; the number of a variables ranging from 1 to p, the number of b variables ranging from 1 to q, with N cases in each variable. (See Fig. 2.)


Fig. 2. Layout of the scores: an a-group of p variables ($a_1, a_2, \dots, a_p$) and a b-group of q variables ($b_1, b_2, \dots, b_q$), with one row of scores for each of the N cases.

The columns indicate the various variables; the rows the various cases in each variable. Thus the first column is the $a_1$ variable; the first row gives the scores of case number 1 on each of the variables. There are p a-variables, and q b-variables, with N cases in each variable.

Let S represent the summation by row (horizontally); let $\Sigma$ represent the summation by column (vertically).

$S_1^p\,a$ expresses the sum of the measures in (any) row, from the $a_1$ to the $a_p$ of that row. Similarly for $S_1^q\,b$.

$\Sigma_1^N\,a$ expresses the sum of the a measures from 1 to N in any a-variable. Similarly for $\Sigma_1^N\,b$.

$\Sigma_1^N S_1^p\,a$ expresses the summation of N rows of $S_1^p\,a$'s, and therefore the sum of all the cases in all of the a-variables. Similarly for $\Sigma_1^N S_1^q\,b$.

Each $a_1, a_2, a_3, \dots, a_p$ is multiplied by a constant $n_1, n_2, n_3, \dots, n_p$ respectively.

Each $b_1, b_2, b_3, \dots, b_q$ is multiplied by a constant $m_1, m_2, m_3, \dots, m_q$ respectively. (It is, of course, obvious that the correlation between two variables is not affected by the introduction of a constant factor in each variable; for the deviations, and the sum of the squares of the deviations (standard units) are not thereby influenced. Thus $r_{(n_1 a_1)(m_1 b_1)} = r_{a_1 b_1}$.)

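$S_1^p\,na$ expresses the summation of any a-row, times n.

$S_1^q\,mb$ expresses the summation of any b-row, times m.

$\Sigma_1^N S_1^p\,na$ expresses the summation of N rows of a's, times n.

$\Sigma_1^N S_1^q\,mb$ expresses the summation of N rows of b's, times m.

$\frac{1}{N}\Sigma_1^N S_1^p\,na$ expresses the average summation of an a-row, times n.

$\frac{1}{N}\Sigma_1^N S_1^q\,mb$ expresses the average summation of a b-row, times m.

$S_1^p\,na - \frac{1}{N}\Sigma_1^N S_1^p\,na$ expresses the deviation of the $S_1^p\,na$ for any row from the average summation.

$S_1^q\,mb - \frac{1}{N}\Sigma_1^N S_1^q\,mb$ is a similar expression in the case of the b-group.

The correlation between the sum of the na-variables and the sum of the mb-variables may now be expressed as follows:

$$r_{(n_1 a_1 + n_2 a_2 + \dots + n_p a_p)(m_1 b_1 + m_2 b_2 + \dots + m_q b_q)} = \frac{\Sigma_1^N \left(S_1^p\,na - \frac{\Sigma S\,na}{N}\right)\left(S_1^q\,mb - \frac{\Sigma S\,mb}{N}\right)}{\left[\Sigma_1^N \left(S_1^p\,na - \frac{\Sigma S\,na}{N}\right)^2\right]^{1/2} \left[\Sigma_1^N \left(S_1^q\,mb - \frac{\Sigma S\,mb}{N}\right)^2\right]^{1/2}} = \frac{A \ \text{(Numerator)}}{B \ \text{(Denominator)}}$$

Let $a_s$ include every a from 1 to N in every a-variable.
Let $b_t$ include every b from 1 to N in every b-variable.
Let $n_s$ include every n from $n_1$ to $n_p$.
Let $m_t$ include every m from $m_1$ to $m_q$.

Expression A (Numerator) may be written

$$n_s m_t \, \Sigma_1^N \left(S_1^p\,a_s - \frac{\Sigma S\,a_s}{N}\right)\left(S_1^q\,b_t - \frac{\Sigma S\,b_t}{N}\right)$$

(since the expression $S_1^p\,n$ includes all values of n from 1 to p, which we have called $n_s$; and therefore $n_s$ may be treated as a constant. Similarly for $m_t$ in the b-group.)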
The aim is to state the above expression for the correlation between sums $S_1^p\,a_s$ and $S_1^q\,b_t$ (the pooled scores for individuals) in terms of correlations between the simple elements, such as $r_{a_1 b_1}, r_{a_1 b_2}, \dots, r_{a_1 b_q}, r_{a_2 b_1}, r_{a_2 b_2}, \dots, r_{a_2 b_q}, \dots, r_{a_p b_q}$.

Since $S_1^p\,a_s$ for an individual = the sum of the values $a_1, a_2, \dots, a_p$ for that individual, and $S_1^q\,b_t$ similarly $= b_1 + b_2 + \dots + b_q$, we may write the Numerator thus:

$$n_s m_t \, \Sigma \left[(a_1 + a_2 + \dots + a_p) - \frac{\Sigma (a_1 + a_2 + \dots + a_p)}{N}\right]\left[(b_1 + b_2 + \dots + b_q) - \frac{\Sigma (b_1 + b_2 + \dots + b_q)}{N}\right]$$

$$= n_s m_t \, \Sigma \left[\left(a_1 - \frac{\Sigma a_1}{N}\right) + \left(a_2 - \frac{\Sigma a_2}{N}\right) + \dots + \left(a_p - \frac{\Sigma a_p}{N}\right)\right]\left[\left(b_1 - \frac{\Sigma b_1}{N}\right) + \left(b_2 - \frac{\Sigma b_2}{N}\right) + \dots + \left(b_q - \frac{\Sigma b_q}{N}\right)\right]$$

$$= n_s m_t \, \Sigma (a_1 + a_2 + \dots + a_p)(b_1 + b_2 + \dots + b_q)$$

(where $a_1, a_2, \dots, a_p$ and $b_1, b_2, \dots, b_q$ are expressed in deviations from the respective means.)

$$= n_s m_t \, \Sigma (a_1 b_1 + a_1 b_2 + \dots + a_1 b_q + a_2 b_1 + a_2 b_2 + \dots + a_2 b_q + \dots + a_p b_1 + a_p b_2 + \dots + a_p b_q)$$

$$= n_s m_t \left[r_{a_1 b_1} N \sigma_{a_1} \sigma_{b_1} + r_{a_1 b_2} N \sigma_{a_1} \sigma_{b_2} + \dots + r_{a_1 b_q} N \sigma_{a_1} \sigma_{b_q} + \dots + r_{a_p b_1} N \sigma_{a_p} \sigma_{b_1} + \dots + r_{a_p b_q} N \sigma_{a_p} \sigma_{b_q}\right]$$

$$= n_s m_t N \sigma_a \sigma_b \, S_{st}\, r_{a_s b_t} \qquad (A)$$

(where $S_{st}\, r_{a_s b_t}$ indicates the sum of the correlations between every a and every b. There are pq such correlations. The expression $\sigma_a$ indicates the constant standard deviation of each a. The standard deviations have been made equal. A similar interpretation holds for $\sigma_b$.)

Expression A might also be written thus:

$$n_s m_t N \sigma_a \sigma_b \, p q \, \bar{r}_{ab} \qquad (A')$$

(where $\bar{r}_{ab}$ is the average of all the pq correlations between the a and b variables.)
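The First Factor of the Denominator (B) may be written

$$\left[\Sigma_1^N \left(S_1^p\,n_s a_s - \frac{\Sigma S\,n_s a_s}{N}\right)^2\right]^{1/2}$$

$$= n_s \left[\Sigma \left((a_1 + a_2 + \dots + a_p) - \frac{\Sigma (a_1 + a_2 + \dots + a_p)}{N}\right)^2\right]^{1/2} \quad \text{(since } S_1^p\,a_s = a_1 + a_2 + \dots + a_p\text{)}$$

$$= n_s \left[\Sigma \left(\left(a_1 - \frac{\Sigma a_1}{N}\right) + \left(a_2 - \frac{\Sigma a_2}{N}\right) + \dots + \left(a_p - \frac{\Sigma a_p}{N}\right)\right)^2\right]^{1/2}$$

$$= n_s \left[\Sigma (a_1 + a_2 + \dots + a_p)^2\right]^{1/2}$$

(where $a_1, a_2, \dots, a_p$ are expressed as deviations from respective means)

$$= n_s \left[\Sigma a_1^2 + \Sigma a_2^2 + \dots + \Sigma a_p^2 + 2\Sigma a_1 a_2 + \dots + 2\Sigma a_1 a_p + \dots + 2\Sigma a_{p-1} a_p\right]^{1/2}$$

$$= n_s \left[N \sigma_{a_1}^2 + N \sigma_{a_2}^2 + \dots + N \sigma_{a_p}^2 + 2N r_{a_1 a_2} \sigma_{a_1} \sigma_{a_2} + \dots + 2N r_{a_{p-1} a_p} \sigma_{a_{p-1}} \sigma_{a_p}\right]^{1/2}$$

$$= n_s \sqrt{N} \sigma_a \left[1 + 1 + \dots + 1 + 2 r_{a_1 a_2} + \dots + 2 r_{a_{p-1} a_p}\right]^{1/2}$$

$$= n_s \sqrt{N} \sigma_a \left(p + 2\, S_{(s)(s-1)}\, r_{a_s a_{s-1}}\right)^{1/2} \qquad (C)$$

(where $2\, S_{(s)(s-1)}\, r_{a_s a_{s-1}}$ expresses the sum of the correlations between every a and every other a. There are p(p-1) of these correlations. The value $S_{(s)(s-1)}\, r_{a_s a_{s-1}}$ is doubled because each particular correlation is included twice, as will be seen from the following diagram:

[Diagram not recoverable from the source.])

Expressed in terms of mean correlation, this becomes:

$$n_s \sqrt{N} \sigma_a \left(p + p(p-1)\, \frac{2\, S_{(s)(s-1)}\, r_{a_s a_{s-1}}}{p(p-1)}\right)^{1/2} = n_s \sqrt{N} \sigma_a \sqrt{p}\, \left(1 + (p-1)\, \bar{r}_{aa}\right)^{1/2} \qquad (C')$$

(where $\bar{r}_{aa}$ expresses the mean of all the correlations between every a and every other a.)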
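By a like procedure, the Second Factor of the Denominator may be reduced to

$$m_t \sqrt{N} \sigma_b \left(q + 2\, S_{(t)(t-1)}\, r_{b_t b_{t-1}}\right)^{1/2} \qquad (D)$$

or to

$$m_t \sqrt{N} \sigma_b \sqrt{q}\, \left(1 + (q-1)\, \bar{r}_{bb}\right)^{1/2} \qquad (D')$$

Returning to the original expression for the correlation between sums, we may, therefore, now write:

$$r = \frac{n_s m_t N \sigma_a \sigma_b \, S_{st}\, r_{a_s b_t}}{\left[n_s \sqrt{N} \sigma_a \left(p + 2\, S_{(s)(s-1)}\, r_{a_s a_{s-1}}\right)^{1/2}\right]\left[m_t \sqrt{N} \sigma_b \left(q + 2\, S_{(t)(t-1)}\, r_{b_t b_{t-1}}\right)^{1/2}\right]}$$

$$= \frac{S\, r_{ab}}{\sqrt{p + 2\, S\, r_{aa}}\, \sqrt{q + 2\, S\, r_{bb}}}$$

(Equation 2, p. 419 in Ref. 5, or Equation 12, p. 424 in Ref. 5.)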

























222 JOURNAL OF EXPERIMENTAL EDUCATION Volume III, No. 2 


(where the elements have been reduced to equal standard deviations before combining them, and where $S\, r_{ab}$ has the same meaning as $S_{st}\, r_{a_s b_t}$, and $S\, r_{aa}$ has the same meaning as $S_{(s)(s-1)}\, r_{a_s a_{s-1}}$, and similarly for $S\, r_{bb}$).

Expressed in terms of mean correlations, we have

$$r = \frac{n_s m_t N \sigma_a \sigma_b \, p q \, \bar{r}_{ab}}{\left[n_s \sqrt{N} \sigma_a \sqrt{p}\, (1 + (p-1)\, \bar{r}_{aa})^{1/2}\right]\left[m_t \sqrt{N} \sigma_b \sqrt{q}\, (1 + (q-1)\, \bar{r}_{bb})^{1/2}\right]}$$

$$= \frac{\sqrt{pq}\; \bar{r}_{ab}}{\sqrt{1 + (p-1)\, \bar{r}_{aa}}\; \sqrt{1 + (q-1)\, \bar{r}_{bb}}}$$

(Equation 1, p. 419, or Equation 13, p. 425, in Reference 5.)

(where the elements have been reduced to equal standard deviations.)
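The agreement between the two forms of the sums formula can be checked on a small table. The sketch below assumes equal standard deviations, as the derivation does; the correlation values are hypothetical and the function names are not Spearman's. The first function works from the sums of correlations (Equation 2) and the second from the mean correlations (Equation 1).

```python
import math


def r_sums_from_table(r, a_idx, b_idx):
    """Equation 2: correlation between the sum of the a's and the sum of the b's."""
    s_ab = sum(r[i][j] for i in a_idx for j in b_idx)
    # Summing the whole a-block gives p (the diagonal) plus twice the sum of
    # the correlations between distinct a's, i.e. p + 2 S r_aa.
    block_a = sum(r[i][j] for i in a_idx for j in a_idx)
    block_b = sum(r[i][j] for i in b_idx for j in b_idx)
    return s_ab / math.sqrt(block_a * block_b)


def r_sums_from_means(p, q, mean_ab, mean_aa, mean_bb):
    """Equation 1: the same correlation in terms of mean correlations."""
    return math.sqrt(p * q) * mean_ab / (
        math.sqrt(1 + (p - 1) * mean_aa) * math.sqrt(1 + (q - 1) * mean_bb)
    )


# Hypothetical table: variables 0 and 1 form the a-group, 2 to 4 the b-group.
r = [
    [1.0, 0.5, 0.3, 0.2, 0.4],
    [0.5, 1.0, 0.1, 0.3, 0.5],
    [0.3, 0.1, 1.0, 0.6, 0.2],
    [0.2, 0.3, 0.6, 1.0, 0.4],
    [0.4, 0.5, 0.2, 0.4, 1.0],
]
a_idx, b_idx = [0, 1], [2, 3, 4]

mean_ab = sum(r[i][j] for i in a_idx for j in b_idx) / 6  # pq = 6 correlations
mean_aa = r[0][1]                                         # one distinct pair
mean_bb = (r[2][3] + r[2][4] + r[3][4]) / 3               # three distinct pairs

eq2 = r_sums_from_table(r, a_idx, b_idx)
eq1 = r_sums_from_means(2, 3, mean_ab, mean_aa, mean_bb)
print(eq2, eq1)
```

With p = q = 1 the mean-correlation form reduces to the single correlation between the two variables, as it should.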

From Equation 2 (above), the case where the standard deviations of the a's, or of the b's, are unequal (i.e., where the elements have not been reduced to the same standard deviation before combining) is easily provided for, as follows: The required value for the right-hand member of the equation would be:

$$\frac{S(\sigma_a \sigma_b r_{ab})}{\sqrt{S(\sigma_a^2) + 2\, S(\sigma_a \sigma_{a'} r_{aa'})}\; \sqrt{S(\sigma_b^2) + 2\, S(\sigma_b \sigma_{b'} r_{bb'})}}$$

(Equation 3, p. 419, Ref. 5.)








(It is easily shown that the correlation between sums of series is equal to the correlation between the averages of those series. In other words, the correlation between the composite scores of a group of individuals is equal to the correlation between the averages of those scores. For

$$r_{(S_1^p a)(S_1^q b)} = r_{\left(\frac{1}{p} S_1^p a\right)\left(\frac{1}{q} S_1^q b\right)},$$

a constant factor in either variable leaving the correlation unaffected. This latter expression is obviously the correlation between the average $\frac{1}{p} S_1^p\,a$ and the average $\frac{1}{q} S_1^q\,b$.)

The correlation between sums as given by Spearman has been analyzed, with particular emphasis upon the formulae basic to the argument of this paper. We may now return to the "reversibility of proof."

The fourth condition demanded that $0 = r_{ab} - r_{ag} \cdot r_{bg}$ (Appendix, p. iv, equation 8; see p. 218, this paper); and this demand is readily seen to be satisfied in the case where the set of variables entering into the table of coefficients is very large. The proof is as follows:

Let there be a large number (m) of x-variables, ranging from variable a to variable z. These are represented in Fig. 3.

Fig. 3. Group of x-variables: one column for each variable a, b, c, ..., z, and one row of scores for each of the N cases ($a_1, b_1, c_1, \dots, z_1$; $a_2, b_2, c_2, \dots, z_2$; ...; $a_N, b_N, c_N, \dots, z_N$).

These variables are expressed, as before, in the form $a = f_a g + d_a$. The sum of all the x-variables (m of them), between and including the limits x = a and x = z, may be expressed $S_{x=a}^{x=z}(x)$; and this also constitutes the most representative value for g, the common overlapping of all the x-variables. The correlation of any particular variable a with the sum of the x-variables including a (i.e., with a + b + .... + z) expresses, therefore, the correlation between a and g. Hence

$$r_{ag} = r_{(a)(a + b + \dots + z)}.$$








Therefore $r_{ag} = r_{a\,S(x')}$.


March, 1935 W. Line and H. B. Hedman 223


Consequently, from Equation 1 (p. 222, above),

$$r_{ag} = \frac{\sqrt{1 \cdot m}\; \bar{r}_{ax'}}{\sqrt{1 + (1-1)\, r_{aa}}\; \sqrt{1 + (m-1)\, \bar{r}_{x'x''}}}$$

(where x' is any one of the x-variables, and x'' is any other one. Then $\bar{r}_{x'x''}$ expresses the average of the correlations between every x (or x') and every other x (or x'').)

Simplifying this equation, we have

$$r_{ag} = \frac{\sqrt{m}\; \bar{r}_{ax'}}{\sqrt{1 + (m-1)\, \bar{r}_{x'x''}}} = \frac{\sqrt{m}\; \bar{r}_{ax'}}{\sqrt{m\, \bar{r}_{x'x''}}} \ \text{(since m is very large)} = \frac{\bar{r}_{ax'}}{\sqrt{\bar{r}_{x'x''}}}.$$

Similarly,

$$r_{bg} = \frac{\bar{r}_{bx'}}{\sqrt{\bar{r}_{x'x''}}}.$$

But, by the postulated tetrad equation,

$$r_{ab} \cdot r_{xx^*} = r_{ax} \cdot r_{bx^*}$$

(where x and x* exclude respectively a and b. In $r_{xx^*}$, any x except a may be correlated with any other x except b. The value $r_{ab}$ is thus excluded.)

Hence $r_{ab} \cdot \Sigma\, r_{xx^*} = \Sigma (r_{ax} \cdot r_{bx^*})$;

and therefore

$$r_{ab} = \frac{1}{\Sigma\, r_{xx^*}} \cdot \Sigma (r_{ax} \cdot r_{bx^*}).$$

But since m is large, this value approximates to the value it would have if x and x* took all values quite independently of each other, i.e., including a and b.

Hence

$$r_{ab} = \frac{\bar{r}_{ax} \cdot \bar{r}_{bx}}{\bar{r}_{xx}} = r_{ag} \cdot r_{bg};$$

and therefore $0 = r_{d_a d_b}$ (p. 6 above.)

The fourth condition is therefore satisfied for a large number of variables.
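How quickly the large-m approximation takes hold can be seen from a short sketch of the sums formula with p = 1 and q = m. The two mean correlations are hypothetical: they correspond to a variable with loading .7 among variables whose loadings are all .6, so that the limiting value is .42 / sqrt(.36) = .7. The function names are illustrative only.

```python
import math


def r_with_sum(m, mean_ax, mean_xx):
    """Equation 1 with p = 1 and q = m: correlation of a with the sum of m x-variables."""
    return math.sqrt(m) * mean_ax / math.sqrt(1 + (m - 1) * mean_xx)


def large_m_value(mean_ax, mean_xx):
    """Value approached when m is very large."""
    return mean_ax / math.sqrt(mean_xx)


mean_ax, mean_xx = 0.42, 0.36  # hypothetical mean correlations

for m in (4, 40, 400, 4000):
    print(m, round(r_with_sum(m, mean_ax, mean_xx), 5))
print("limit", large_m_value(mean_ax, mean_xx))
```

With only a handful of variables the exact value falls noticeably short of the limit, which is why the restriction to a large m matters for the argument as given.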


The four conditions necessary to the 
postulated “reversibility” have now been 
shown to be satisfied. In the case of con- 
dition 4, however, m was assumed to be 
large. That this condition is met without 
the limitation of a large m will be shown in 
a subsequent paper. 
THE NATURE OF VERBAL AND NON-VERBAL ABILITIES
by
Jess H. Edds
Lincoln Memorial University
Harrogate, Tennessee


The interest of the present study was centered around the problem of differentiable mental abilities, with emphasis on the "general ability" contention of Spearman (7) and his followers as opposed by Thorndike (11), who holds for "special abilities." Throughout the study verbal ability is meant to embrace that capacity necessary in comprehending the meaning of words both in and out of context and in reading material for facts called for, and by non-verbal ability is meant that ability necessary in solving certain non-verbal problems and analogies not presenting words. The study has attempted to answer the following pertinent questions: 1. Are verbal and non-verbal abilities independent capacities or are they the same capacity tested differently? 2. Do these abilities possess group factors? 3. Is the group factor present in the non-verbal tests also present in the verbal tests? 4. What is the relationship between these abilities and intelligence? 5. How should these abilities be weighted in predicting school grades?

The subjects were one hundred forty high school students in Peabody Demonstration School during the school year of 1929-1930. Maturity as an influencing factor seems to have been negligible, as is indicated by correlations involving chronological age. The median chronological age was fifteen years and five months. The tests used to measure the designated abilities were as follows:


VERBAL TESTS

1. Haggerty Reading (Sigma 3)
2. Whipple College Reading
3. Means Hard Opposites
4. Inglis Vocabulary


NON-VERBAL TESTS*

1. International Group Mental Test (Form B)
2. Geometric Form Test
3. Atkinson Group Test


Mental ability was measured by the 
Otis Self Administering Test, Form A. All 
tests were administered by the group method 
and scored by objective keys. The Interna- 
tional Group Mental Test (devised by 
E. A. Doll) is an arrangement of pictures 
into many grouped series such that an item 
in one series matches in some specific way 
an item in the other part of the same series. 
The more advanced sections of the scale 
have a centrally located picture which 
serves as a cue to similarity required. The 
subject matches the items by joining them 
with a pencil mark. 

The Geometric Form Test is a pencil-and-paper test adapted from the Minnesota Mechanical Aptitudes Test. Each part of the test presents some geometric figure intact and also dissected. The problem is to draw lines through the completed figure illustrating how the different parts should be placed to make a similar figure.

The Atkinson Group Test (devised by W. R. Atkinson) consists of four rows and four columns of the first sixteen letters of the alphabet. The problem is to make combinations (24 are possible) of four letters, taking no two letters from the same row or column.
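The figure of 24 possible combinations can be verified by enumeration. The sketch assumes the sixteen letters are laid out row by row (A to D in the first row, and so on); the actual arrangement on the test form is not given in the text, and the count does not depend on it.

```python
from itertools import permutations
from string import ascii_uppercase

# Assumed layout: the first sixteen letters, four to a row.
grid = [list(ascii_uppercase[4 * row: 4 * row + 4]) for row in range(4)]

# One letter is taken from each row; assigning a different column to each
# row guarantees that no two letters share a row or a column.
combos = [
    "".join(grid[row][col] for row, col in enumerate(cols))
    for cols in permutations(range(4))
]

print(len(combos))
print(combos[:3])
```

Each combination corresponds to one way of assigning the four columns to the four rows, which is why the count is 4 x 3 x 2 x 1.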


RESULTS 


A. Calculation.--Intercorrelations were calculated by use of the Pearson product-moment formula. These correlations are shown in Table I, page 226, with reliability coefficients underlined.








1. Tests one and two of the Non-Verbal group were supplied 


by Joseph Peterson through the National Research Council. 
TABLE I

INTERCORRELATIONS BETWEEN THE DIFFERENT TEST ITEMS
(reliability coefficients, on the diagonal, are marked with underscores)

                                         1      2      3      4      5      6      7      8      9

1. Age                                        .04   -.03   -.01   -.04   -.01    .01    .06   -.10
2. Otis                                     _.92_    .68    .54    .69    .66    .51    .38    .28
3. Haggerty Reading                                _.91_    .69    .59    .66    .45    .30    .38
4. Whipple Reading                                        _.64_    .50    .67    .57    .23    .28
5. Means Opposites                                               _.89_    .66    .34    .23    .25
6. Inglis Vocabulary                                                    _.86_    .34    .24    .28
7. International Group Mental Test                                             _.86_    .49    .47
8. Forms                                                                              _.82_    .40
9. Atkinson                                                                                  _.84_





The reliability coefficients were obtained by the split-halves method and may be considered "high." The average for the verbal tests is .82 and that for the non-verbal tests is .84. The averages of the intercorrelations between the four verbal tests, the three non-verbal tests, and the correlations between the two groups of tests are as follows:

    Verbal tests .................. .63
    Non-verbal tests .............. .42
    Verbal with non-verbal tests .. .31

Schneck (5) found an average intercorrelation of .4920 between five verbal tests, .3383 between four numerical tests, and .1441 between the verbal and the numerical groups.





The tetrad difference technique was applied to the correlation coefficients in Table I, with the exception of chronological age and scores on Otis. One hundred five tetrad differences were obtained, although 210 such tetrads are possible if both plus and minus signs are assigned. The opposite sign is not given arbitrarily, but may be obtained by arranging any four variables into six combinations instead of three. Pearson (4) supports the practice of doubling the number of tetrads in order to get a smoother frequency curve. The present study made use of the 105 regular tetrads. $M_t$ = .0486 and $PE_t$ = .0524 when Spearman's (7, Appendix, p. XI) formula was used for finding the probable error of the tetrads. This close agreement of $M_t$ and $PE_t$ is, according to Spearman, part of the tetrad criterion satisfied, whereas Pearson (4) holds that even a one-point difference in the second decimal figure is a serious deviation when analyzed comparatively. All of the 105 tetrads fell within Spearman's (7, footnote, p. 295) ultimate criterion. It is concluded, then, that the tests may be thought of as having a central factor present in all of the seven tests plus a specific factor peculiar to each single test and not present in the others. This conclusion is based on the Spearman technique.
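The counts of 105 and 210 follow from simple enumeration, as the sketch below shows. The labels for the three pairings are an illustrative shorthand (for example, "12.34 - 13.24" stands for the product of the first-second and third-fourth correlations less the product of the first-third and second-fourth); they are not notation taken from the article.

```python
from itertools import combinations

tests = ["Haggerty", "Whipple", "Means", "Inglis",
         "International", "Forms", "Atkinson"]

# Each set of four tests yields three regular tetrad differences,
# identified here by the pairing of the correlations involved.
PAIRINGS = ("12.34 - 13.24", "12.34 - 14.23", "13.24 - 14.23")

regular = [(quad, pairing)
           for quad in combinations(tests, 4)
           for pairing in PAIRINGS]

# Assigning both signs to every difference doubles the number.
signed = [(quad, pairing, sign)
          for quad, pairing in regular
          for sign in (+1, -1)]

print(len(regular), len(signed))
```

Seven tests give 35 sets of four, hence 35 x 3 = 105 regular tetrads and 210 when both signs are counted.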


C. Correlations between V and N-V. Apparently the specific factors involved have little influence, since their relationships are zero within their probable error criterion. It was possible to test this observation by one other method, namely, the correlation between the scores on verbal and non-verbal tests. In order to complete this method, reduced scores for the verbal and the non-verbal tests were necessary, and by using Woodworth's (14) method the four verbal tests were reduced to one, referred to as V; likewise the three non-verbal tests were reduced to one set of scores, referred to as N-V. The correlation between V and N-V was then found to be .26, which is rather low in comparison to correlations between verbal material (r = .63) or non-verbal material (r = .42). This low correlation does not agree with an arithmetical average of the intercorrelations between the tests; however, the method is in common use (2), (5), (4). The method referred to combines several scores into one by reducing them all to standard scores. The lack of correlation between V and N-V is likely due to specific factors; therefore it appears that V and N-V are in a large measure distinct, or at least not the same. The implication of the study at this point is in opposition to the view of Spearman, who holds that the general factor is the same throughout all mental functions, varying only in the proportion that particular abilities make use of the central ability. Spearman's later view admits the possibility of group factors.
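The reduction of several tests to a single set of scores can be sketched as follows. The text says only that the scores are combined by reducing them all to standard scores; averaging the standard scores for each pupil is an assumption made here for illustration, and the raw scores and function names are hypothetical.

```python
import statistics


def standard_scores(raw):
    """Express raw scores as deviations from the mean in standard-deviation units."""
    mean = statistics.mean(raw)
    sd = statistics.pstdev(raw)
    return [(x - mean) / sd for x in raw]


def composite(score_lists):
    """Average each pupil's standard scores over the tests supplied."""
    z_lists = [standard_scores(scores) for scores in score_lists]
    return [sum(zs) / len(zs) for zs in zip(*z_lists)]


# Hypothetical raw scores of five pupils on two verbal tests.
verbal_raw = [
    [52, 61, 47, 70, 58],
    [30, 41, 28, 44, 35],
]

v = composite(verbal_raw)
print([round(score, 3) for score in v])
```

Because each test is first put on the same scale, a test with a wide range of raw scores does not dominate the composite.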

In view of the fact that there may yet be group factors involved in the tests used, one other criterion was applied. For this purpose Kelley's (3) test for group factors was employed. The statement is as follows: "If the intercorrelations between four variables are such that $t_{1234} = t_{1243}$, and $t_{1342} = 0$, they could conceivably have arisen from four variables $X_1$, $X_2$, $X_3$, and $X_4$, through which was a general factor plus, in addition thereto, a second factor common to $X_1$ and $X_2$ or a second factor common to $X_3$ and $X_4$." Kelley's technique was applied to the intercorrelations between Haggerty Reading, Means Opposites, International Mental Test, and Forms. If a designates a general factor common to all four variables, b a group factor common to $x_1$ and $x_2$, and $s_1$, $s_2$, $s_3$, and $s_4$ the factors specific to the respective variables, the factor pattern may be represented as follows:

    $x_1$ (Haggerty Reading)     $= \alpha_1 a + \beta_1 b + \gamma_1 s_1$
    $x_2$ (Means Opposites)      $= \alpha_2 a + \beta_2 b + \gamma_2 s_2$
    $x_3$ (International M.T.)   $= \alpha_3 a + \gamma_3 s_3$
    $x_4$ (Forms)                $= \alpha_4 a + \gamma_4 s_4$


If the variables in the above factor pattern are assumed to be in terms of standard deviation measures, the coefficient of correlation between a pair of variables is equal to the sum of the products of the factor loadings of the factor or factors common to the two variables. Applying this principle, the intercorrelations are as follows:

    $r_{12}$ (Haggerty and Means)                $= \alpha_1 \alpha_2 + \beta_1 \beta_2$
    $r_{13}$ (Haggerty and International M.T.)   $= \alpha_1 \alpha_3$
    $r_{14}$ (Haggerty and Forms)                $= \alpha_1 \alpha_4$
    $r_{23}$ (Means and International M.T.)      $= \alpha_2 \alpha_3$
    $r_{24}$ (Means and Forms)                   $= \alpha_2 \alpha_4$
    $r_{34}$ (International M.T. and Forms)      $= \alpha_3 \alpha_4$

Tetrad differences formed from the above correlations give:

    $t_{1234} = \alpha_3 \alpha_4 \beta_1 \beta_2$
    $t_{1243} = \alpha_3 \alpha_4 \beta_1 \beta_2$
    $t_{1342} = 0$

Theoretically the first two tetrads equal each other and the third equals zero.
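Both the theoretical pattern and the observed figures can be checked with a few lines. The pairing of correlations in each tetrad difference is an assumed convention, chosen because it reproduces the theoretical results above; the loadings are hypothetical, while the six observed correlations are those given in Table I for Haggerty Reading, Means Opposites, the International Group Mental Test, and Forms.

```python
def tetrads(r12, r13, r14, r23, r24, r34):
    """Three tetrad differences for four variables (assumed pairing convention)."""
    t1234 = r12 * r34 - r13 * r24
    t1243 = r12 * r34 - r14 * r23
    t1342 = r13 * r24 - r14 * r23
    return t1234, t1243, t1342


# Theoretical case: hypothetical loadings on the general factor (alpha)
# and on a group factor shared by the first two variables only (beta).
alpha = [0.7, 0.6, 0.5, 0.4]
beta = [0.5, 0.4]
theory = tetrads(
    alpha[0] * alpha[1] + beta[0] * beta[1],  # r12 carries the group factor
    alpha[0] * alpha[2],
    alpha[0] * alpha[3],
    alpha[1] * alpha[2],
    alpha[1] * alpha[3],
    alpha[2] * alpha[3],
)
expected = alpha[2] * alpha[3] * beta[0] * beta[1]

# Observed case: correlations from Table I
# (1 = Haggerty, 2 = Means, 3 = International, 4 = Forms).
observed = tetrads(r12=0.59, r13=0.45, r14=0.30, r23=0.34, r24=0.23, r34=0.49)

print([round(t, 4) for t in theory], round(expected, 4))
print([round(t, 4) for t in observed])
```

The observed values come out at .1856, .1871 and .0015, in agreement with the tetrads the article reports for these four tests.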
This method was applied to all combinations of the four variables, first assuming the presence of an additional factor in only two of the variables. But in the last case, to be shown below, an additional factor was assumed in the verbal tests and a different additional factor common to the non-verbal tests but not present in the verbal group. These cases were calculated after the fashion of the above method. The actual tetrads calculated from the four variables chosen are as follows:

    $t_{1234}$ = .1856
    $t_{1243}$ = .1871
    $t_{1342}$ = .0015

The first two tetrads may be considered equal to each other and the third one equal to zero. Assuming a group factor (b) in Haggerty Reading and Inglis Vocabulary and an additional group factor (c) in Atkinson and Forms, the theoretical results are:

    $t_{1234} = \alpha_3 \alpha_4 \beta_1 \beta_2 + \alpha_1 \alpha_2 \gamma_3 \gamma_4 + \beta_1 \beta_2 \gamma_3 \gamma_4$
    $t_{1243} = \alpha_3 \alpha_4 \beta_1 \beta_2 + \alpha_1 \alpha_2 \gamma_3 \gamma_4 + \beta_1 \beta_2 \gamma_3 \gamma_4$
    $t_{1342} = 0$

(where $\gamma_3$ and $\gamma_4$ here denote the loadings of the third and fourth variables on the group factor c), and the actual tetrads are:

    $t_{1234}$ = .1728
    $t_{1243}$ = .1800
    $t_{1342}$ = .0072
Other combinations of the verbal and the 
non-verbal tests showed essentially the 

same thing. These satisfy the condition 

set up in Kelley's sixteenth proposition. 
Therefore, if all of the tests are thought 
of as having a common general factor, and 
an additional factor present only in the 
verbal tests, the tetrad differences agree 
with the prediction. If we consider the 
additional factor present only in the non- 
verbal tests, the prediction is also satis- 
fied. The theory is also satisfied if an 
additional factor is considered present in 
each of the two types of tests. It follows, 
then, that there is a group factor common to 
the verbal tests or to the non-verbal tests. 
There may even be two group factors; the
one common to the verbal tests, the other
common to the non-verbal tests. It seems
convincing, however, that there are group
factors which do not cut across both sets
of tests, and none of those examined seem
to cut across both verbal and non-verbal
material.

The original correlation of .26 between 
V and N-V is further substantiated, since 
Kelley's criterion is satisfied. It is not 
concluded beyond all doubt that this is the 
case, but at the present time the converse 
of Kelley's proposition has not been
proved.

Since the Spearman technique has been 
employed, it seems advisable to review 
briefly some of its holdings and at the 
same time give the reaction of some of 
Spearman's contemporaries to his viewpoint. 
In 1904 Spearman (8) made his consistent ap-
peal for "general intelligence." He con-
cluded that there are "intelligences" and
also a general universal ability. In 1914
(9) he gave further evidence of the two fac- 
tor theory by making use of Simpson's and 
Thorndike's data. It was in this study 
that Spearman showed how tetrad differences 
conform to a regular frequency curve. Simp- 
son (6) in his original study, however, 
concluded against Spearman's theory which 
holds that intelligence is to be explained 
on the basis of a hierarchy of mental func- 
tions wherein the amount of correlation in 
each case is due to the degree of connec- 
tion with a common central factor. Pearson 
(4, pp. 289-290) points out that Spearman's 
frequency curves based on tetrad differences 
are by no means symmetrical when exact mathe-
matics is applied. He finds Spearman's
probable error in error by .001, and the ob-
served mean in error by .005. Pearson further
contends that there is an error in Spearman
and Holzinger's formula for calculating the
probable error of tetrad differences.

In a review of The Abilities of Man, 
Wilson (13) accuses Spearman of calculating 
intelligence instead of measuring it. Later, 
in a comment on Spearman's (10) "g" factor, 
Wilson (12) holds that the equation r,, = 1, 
also is not a single equation, but as many
equations as there are subjects taking the 
test. Quoting Wilson: "just as x² + y² +
z² = 0, it forces x = 0, y = 0, z = 0."
(12, p. 223.)
Asher (2) gave certain tests to 805
college freshmen and found that tetrad dif-
ferences fell within 3 P.E. His tests, cor-
related with school marks, showed r = .605
and r = .580 for 1926 and 1927, respectively.
Asher concluded that if tests are made to 
depend on "g" a higher correlation is found 
between such tests and scholarship. 

Only a few representative discussions 
dealing with the two contradicting concep- 
tions of the nature of intelligence have 
been mentioned. It is seen that these views 
are not reconciled to each other. 


D. The Relation of V and N-V to Intel-
ligence and Scholastic Records. The low
correlation, r = .26, between V and N-V
raises the question of relationship between 
these two measures and mental ability and 
school grades. Reduced scores were calcu- 
lated for school grades and for intelligence 
test scores. The correlation coefficients 
are as follows: 











1. Verbal ability r.. * 
2. Non-verbal ability | r,, = .38 
3. Intelligence Tes = .42 
4. School grades | Ta, = -40 


The subscripts are in keeping with the 
numbers for the tests. 

It is seen that V and N-V correlated 
practically the same with school grades. It 
may be said, then, that the non-verbal tests 
combined are as good a measure of success in 
school as a combination of the verbal tests. 
The higher relationship between V and intel- 
ligence may be real although the verbal 
elements in the Otis test are in part re- 
sponsible for the twelve points advantage. 

The foregoing results seem to justify 
the following tentative conclusions in an- 
swer to the questions raised in the begin- 
ning: 1. Verbal and non-verbal abilities 
seem to be rather different capacities show- 
ing low relationship to each other. 2. Either 
V or N-V contains a factor not present in 
the other. 3. A common group factor does 
not seem to be present to the same degree 
in both verbal and non-verbal material. 

4. Mental ability, as measured by the Otis 
S. A., correlates twelve points higher with 
V than with N-V. 5. V and N-V have prac- 
tically equal weight in predicting class 
scores. 
MEASUREMENT OF INFANT BEHAVIOR¹
by 
Helen Thompson 
Clinic of Child Development 
Yale University 


We tend to think of growth in terms of
increase and differentiation. The infant 
shows definite increase in his ability to 
attain the upright position, in the variety 
of tasks which he can perform, and in gener- 
al mastery of his environment. Scales for 
measuring growth accordingly have been com- 
posed of items of behavior of ascending or- 
der of difficulty and versatility as have 
been the tests for older children and 
adults. Pintner,² in his book on "Intelli-
gence Testing", reflecting the general atti-
tude regarding mental measurement, lists the
criteria of a test of intelligence and in-
cludes, "Criterion 2. Increasing ability at
successive age levels", and says, "Obviously 
if a test or scale fails to show this in- 
crease we can get no measure of the child." 
Of course if a test is composed of elements 
which do not increase in frequency of occur- 
rence at ascending age levels, the total 
score accordingly will not increase. 

Careful and detailed study of infant 
behavior and its manner of development has 
revealed some further characteristics and 
trends which suggest that another method of 
evaluating growth levels should be devised 
which will utilize more fully the facts of
development and thereby give us a more valid,
accurate, and finely graded measuring device.

The complete findings of the study are 
detailed in a recent monograph.³ Here, it
will suffice to say that the group of in-
fants⁴ studied was a highly homogeneous one
with respect to social-economic status and
race.⁵ Age level distinctions were there-
fore defined even more sharply than they
would have been had twice the number of 
cases been examined. 

For this study, the infant who had 
been brought to the clinic by his mother, 
was taken to the photographic dome or exam- 
ining room, and, placed nude on the crib 
platform, was presented in a specified man- 
ner with the simple objects designed to 
elicit his behavior. The examiner,⁶ who
stood by his left side, dictated in a natu- 
ral but subdued voice, the infant's general 
activity and response directed specifically 
to the stimulus. Naturally the examiner was 
watching for certain behavior which experi- 
ence had led her to expect, but she was al- 
so alert to any detail not previously ob- 
served, and furthermore, she was watching 
for any response which the infant might 
make, whether or not it conformed with the 
expected behavior. 





1. This paper was presented at the Tenth International Congress of Psychology, August 1932, Copenhagen, Denmark.


2. Pintner, Rudolf, Intelligence Testing. Henry Holt and Co., New York, 1923, p. 62.

3. Gesell, A. and Thompson, H. assisted by Amatruda, C., Infant Behavior: Its Genesis and Growth. McGraw-Hill Book
Co., New York, 1934, 343 pages.


4. At least 26 infants, 135 girls and 15 boys were examined at each age level. With only a few exceptions the records 
were obtained within two days of the stated age; this precision also reduced the variability of the group and reli-
ability of the findings. All infants were of normal gestation period as indicated by the prenatal history and birth
weight. No ill or seriously underweight infants were included. Cinema records of the behavior as well as the 
mother's reports of the infant's activity at home check and confirm the results. 

5. The subjects were all from homes characteristic of the middle social-economic status of the country. The races rep-
resented were those of northern European extraction so far as this could be established from information concerning
the nationality of the grandparents.


6. Eighty-two percent of the examinations were made by the same two examiners, one who examined the infants from 4 
through 12 weeks and the other who examined them from 16 through 56 weeks. 
At least 60 percent of the infants at
any age level had been examined in the same
manner four weeks earlier. Because of the
rapidity of growth, situations profitable
for eliciting behavior at early levels were
necessarily dropped as new and more fitting
situations were introduced, but this was done
so that, as far as possible, the continuity
of the examinations of the various age lev-
els was preserved.

Behavior trends were verified by refer- 
ence to the cinema records made at the time 
of observation. 

Data from one situation will serve to 
typify the trends of development observed. 


The Supine Situation 

The nude infant was placed in supine 
position on the crib platform and his gener- 
al body posture and behavior was noted. No 
further stimulation was afforded; the be-
havior therefore indicated not what the in-
fant could do but what he did do, and it in-
dicated only what he did do in this specific
situation during the time allotted.

The percentage of the number of cases
displaying each behavior item at each age
level was determined. It was evident that
the trend of development of any specific
behavior followed one of four courses:
(1) It increased in frequency of occurrence;
(2) It decreased; (3) It increased to a cer-
tain frequency and then decreased; and
(4) It fluctuated. Items showing constant
trends were rarely observed.
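
The four courses described above can be illustrated with a short sketch. It is not the scoring method of the study: the function names, the 5-point tolerance used to ignore small changes, and all percentages are hypothetical choices made only to show how a series of age-level percentages might be sorted into the four courses.

```python
def percent_displaying(observations):
    """Percentage of infants showing an item at each age level.

    observations maps age in weeks to a list of True/False values,
    one per infant examined at that age (hypothetical data layout).
    """
    return {age: 100.0 * sum(seen) / len(seen)
            for age, seen in sorted(observations.items())}


def classify_trend(percents, tolerance=5.0):
    """Assign a series of percentages to one of the four courses.

    Changes smaller than `tolerance` points are ignored; this threshold
    is an assumption of the sketch, not a value from the study.
    """
    directions = []
    for earlier, later in zip(percents, percents[1:]):
        change = later - earlier
        if abs(change) <= tolerance:
            continue
        direction = 1 if change > 0 else -1
        if not directions or directions[-1] != direction:
            directions.append(direction)
    if not directions:
        return "constant"
    if directions == [1]:
        return "increasing"
    if directions == [-1]:
        return "decreasing"
    if directions == [1, -1]:
        return "increasing then decreasing"
    return "fluctuating"


# Hypothetical percentages at successive age levels.
examples = {
    "item A": [10, 30, 80, 95],
    "item B": [90, 60, 20, 0],
    "item C": [5, 20, 35, 20, 5],
    "item D": [40, 10, 5, 30, 60],
}
for name, series in examples.items():
    print(name, classify_trend(series))
```

In this sketch a series that falls and then rises is counted as fluctuating, since it matches none of the first three courses.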

The graph shows a sampling of the
trends which were found. Head predominately
in midline is behavior seen with increasing
frequency. The curve rises sharply between
12 and 16 weeks; after that it indicates
frequency so common that the behavior
is no longer significant. At early ages
the head is predominately rotated, and this
turning of the head furnishes the stimulus
for the tonic neck reflex posture of the
arms. These two aspects of behavior are
related but not interdependent. The
prominent tonic neck reflex position of the
arms is seen with great frequency early but
disappears by 20 weeks, though not because
the head is no longer predominately on the
side; when the head is turned to the side
at 20 weeks, the infant either maintains a
symmetrical position of his arms and legs or
he rolls to the side. The rolling to the
side at 4 weeks is brought about by the
rounded back of the infant, together with
considerable, almost continuous, and often
abrupt movements which place his center of
gravity beyond the small area of contact
with the supporting surface. As the back
becomes less rounded the activity decreases,
but it increases sharply again in a different
pattern, as the infant, swinging the legs
and turning the head, rolls to the side.
With more careful study two items could be
specified. This is probably true for all
items which show a fluctuating trend. It
is possible, however, on another basis to
appraise this fluctuating behavior and
therefore such an analysis has not been
made. As the head assumes the midline
position and the tonic neck reflex disappears,
the arms, which have been externally rotated,
rotate internally and, with semi-flexion at
the elbow, the hands come together; mutual
fingering results, but this activity disappears
when the arms are further extended and
reach down to grasp the feet which, by
flexion of the legs at the hip joint, have
been extended and lifted into the line of
vision.

The changing complexity of development 
is at once apparent when the trends of be- 
havior items are thus studied. Behavior 
growth may be properly regarded as an or-
ganized and systematic changing complex.
This does not oppose the view that develop- 
ment is signified by increase and differen- 
tiation in ability, instead it emphasizes 
the intricacy of the differentiation and in- 
dicates the importance of responses which do 
not align themselves simply with increasing 
abilities. When an infant is placed in the 
dorsal position and brings his hands to- 
gether clasping them, at once his behavior 
indicates between 8 and 32 weeks maturity. 
This surely has more significance concern- 
ing developmental age than the increasing 
item, head predominately in midline which 
when observed portends only the limit of 
four weeks. For the same reason when hands 
catch feet is observed it indicates the 
stage of development reached more definite- 
ly than would any more permanent item. True, 
we cannot attach adverse significance to 
the absence of this behavior but that is 
not adequate reason for disregarding its 
significance when it is seen. 

It is important to note that while 
each item reflects other aspects of activ- 
ity, complete relationship does not exist 
except when the possibilities of response 
are necessarily dichotomous. For instance, 
while prominent tonic neck reflex position 
assumes dissymmetry of the arms, prominent 
dissymmetry of the arms does not necessarily 
imply presence of the tonic neck reflex. The 
multiplicity of items is not the result of 
listing different aspects of the same re- 
sponse. Each item is unique. 
Naturally the question is raised
whether most items could not be worded or 
arranged so that increasing trends would be 
represented. To change the item prominent 
tonic neck reflex attitude into an increas- 
ing item and preserve its import could be 
done but this would mean neglecting what is 
seen and giving value to behavior not seen.
It is preferable to evaluate what we see
rather than what we do not see. Further-
more the presence of this behavior is mean-
ingful. It indicates maturity of less than
20 weeks.
To arrange the nodal items in an as-
cending series would be to introduce an ar-
tifact which is scarcely justified; credit
could not be given for mutual fingering be-
cause grasps foot is observed; also a minus
score for failure to catch hold of the feet
would not be justified when, at its peak at
32 weeks, it is seen in only 35 percent of
the infants whom we have every reason to
believe represent the normal. It would be
fully as reasonable to give it a plus rat-
ing when not observed because 65 percent of
the cases did not respond in that way!
The graph is not misleading when it 
shows that only the minority of behavior 
items steadily increase in frequency. When 
behavior is minutely studied, we find it so 
full of individual patterns that 100 percent 
frequency of any specific response is rela- 
tively infrequent even in healthy normal 
children. It is not the presence or ab- 
sence of any one item of behavior but rath- 
er it is the total complex which precisely 
indicates the child's stage of development. 
The nodal, fluctuating and decreasing items 
are entities of behavior which have unique 
importance from a diagnostic and prognostic 
viewpoint. 
Methods of scoring are being worked 
out which will enable us properly to assess 
such behavior. The details of mathematical 
treatment are still in the process of for- 
mulation and are therefore not ready for 
presentation. Several possible methods are 
being tried out. The problem is by no means 
an insurmountable one. 
THE CHOICE OF CENTRAL TENDENCY
by 


Edward A. 


Harvard University 


Most textbooks on educational statis-
tics discuss to some extent the factors
that should be considered when a choice of
central tendency is made for use in a par-
ticular investigation. No book, however,
gives anything like a complete list of the
advantages and disadvantages of the various
averages, and so the writer has gathered for
his students, and with their help, the fol-
lowing statements.

Yule, in his Introduction to the The- 
ory of Statistics, sets down six basic cri- 
teria which should be considered when any 
problem in the choice of central tendency 
arises. These criteria are set down under 
the first heading. 

The other sections deal with the ad-
vantages and disadvantages of the various
averages. These were found, either definite- 
ly or by implication, in the texts, or were 
brought out in class discussion. The most 
helpful books were King's Elements of Sta- 
tistical Method, and Garrett's Statistics 
in Psychology and Education. 

A study of the statements should make 
clear the fact that no one measure of cen- 
tral tendency can be called the best, with- 
out qualification. The various averages 
tell different facts about the series of 
measures which they are used to represent. 
Sometimes one of these facts is described, 
sometimes another, sometimes a combination 
of two or more. 

The lists of advantages and disadvan- 
tages also reveal that the same quality or 
attribute which may be an advantage in one 
problem, or phase of a problem, is likely 
to be a disadvantage in other circumstances. 
Thus, if it is desired to give weight to 
the extreme measures, it is necessary to 
use an average which does this. If the ef-
fect of extremes is to be minimized, a dif-
ferent type of average must be used. The
fact that the geometric and harmonic means
have special applications is both advan-
tageous and disadvantageous.

Another general consideration is im-
portant. If, as is frequently true, the
investigator wishes to compare his own re-
sults with those of previous studies, he
must usually employ the same measure of
central tendency as was previously employed.
Perhaps the most common application of this
principle is found in the handling of stand-
ard test results. The makers of the tests
have practically all adopted the median as
the measure of central tendency in which to
express the norms or standards. Thus, when
it is desired to compare the abilities of
classes or other groups with norms, it is
necessary to compute the medians.

FACTORS FOR CONSIDERATION IN 
THE CHOICE OF CENTRAL TENDENCY 


A. Yule's Criteria for Averages. (p. 108ff.)
1. The average should be rigidly defined,
and not left to the mere estimation
of the observer.
2. It should be based on all the observa-
tions made.
3. It should not be of too abstract
mathematical character. That is, it
should be readily understood.
4. It should be calculated with reason-
able ease and rapidity.
5. It should be as little affected as
possible by the fluctuations of sam-
pling, that is, it should be stable.
(This criterion applies only when the
central tendency is used as a basis
of generalization. It has no point
if the investigator is concerned with
the data actually at hand.)
6. It should lend itself readily to al-
gebraic treatment.
B. Advantages of the Arithmetic Mean.

1. It meets all of Yule's criteria satis- 
factorily. 

2. Its calculation does not require the 
arrangement of the data in any par- 
ticular way. 

3. It gives weight to extreme variations, 
which is desirable in some cases. 

4. It may be calculated when only the
number of items and their aggregate
sum are known, i.e., when information
concerning the separate items is not
available.


C. Disadvantages of the Arithmetic Mean.

1. It cannot be determined by inspection 
from a graph or frequency table. 

2. It cannot be determined accurately 
when the extreme measures of a series 
are missing. 

3. It emphasizes extreme variations, 
which is often undesirable. 

4. It cannot be used in the study of in-
commensurable quantities.

5. It may fall where no data actually ex-
ist.
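
Several of the statements about the arithmetic mean can be seen in a small numerical sketch. The scores below are hypothetical, and `mean_from_totals` is an illustrative helper name; the example shows the mean computed from the count and aggregate sum alone, the weight it gives to one extreme measure, and the fact that it may fall where no datum exists.

```python
from statistics import mean, median


def mean_from_totals(total, count):
    """Arithmetic mean from the aggregate sum and the number of items only."""
    return total / count


scores = [4, 5, 5, 6, 7]          # hypothetical measures
with_extreme = scores + [40]      # the same series plus one extreme measure

mean_plain = mean(scores)
mean_extreme = mean(with_extreme)
median_plain = median(scores)
median_extreme = median(with_extreme)

# The mean needs no ordering of the data and no knowledge of separate items.
same_from_totals = mean_from_totals(sum(scores), len(scores))

# The mean may fall where no measure actually exists.
mean_is_an_observed_score = mean_plain in scores

print(mean_plain, mean_extreme)
print(median_plain, median_extreme)
print(same_from_totals, mean_is_an_observed_score)
```

The single extreme measure roughly doubles the mean in this example, while the median moves only from 5 to 5.5.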


D. Advantages of the Median.


1. It is fairly rigidly defined. 

2. It is based on all the measures.

3. Though not as familiar a measure as
the mean, it is easy to explain and
understand.

4. It is easy to calculate. 

5. It does not give excessive weight to 
extreme cases, which is usually desir- 
able. 

6. It can be used when only the number of 
the extreme items is known, even if 
their exact magnitude cannot be ascer- 
tained. 

7. It can be used when the trait or qual- 
ity being studied is not susceptible 
of measurement in definite units. Thus 
it is possible to array a group of 
children according to their status in 
some trait and find a median. 


E. Disadvantages of the Median.

1. It can only be found by a special ar- 
rangement of the measures either in a 
distribution table or in serial order. 

2. It is not so reliable or stable as
the mean.

3. It cannot be used when it is desirable
to give weight to the extreme measures.

4. It is not susceptible to algebraic
treatment.

5. It may fall where there are few or no
cases.

6. A correct total of the measures cannot 
be found by multiplying the median by 
the number of cases. 
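
A brief sketch may make three of the points about the median concrete. All values and names are hypothetical. Infinite placeholders stand in for extreme items whose exact magnitude is unknown, a ranked list stands in for a trait that cannot be measured in definite units, and the last lines show that the median multiplied by the number of cases does not recover the total.

```python
from statistics import median

# Extreme items whose number is known but whose magnitude is not.
UNKNOWN_LOW, UNKNOWN_HIGH = float("-inf"), float("inf")
scores = [UNKNOWN_LOW, 52, 55, 58, 61, 64, UNKNOWN_HIGH]
median_score = median(scores)

# Children arrayed from lowest to highest in some trait (no units needed).
ranked_children = ["Ann", "Ben", "Cal", "Dot", "Eve"]
median_child = ranked_children[len(ranked_children) // 2]

# The median times the number of cases is not a correct total.
values = [1, 2, 3, 10]
total_from_median = median(values) * len(values)
true_total = sum(values)

print(median_score, median_child, total_from_median, true_total)
```

The placeholders work only because the median depends on the order of the extreme items and not on their size; an arithmetic mean of the same series could not be computed.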


F. Advantages of the Mode.





1. It is very easily found from a dis- 
tribution table. 

2. It is not affected by extreme meas- 
ures. 

3. It cannot fall where no measures ex- 
ist. 

4. It is often the best representation 
of the group. 


G. Disadvantages of the Mode. 
1. It is not rigidly defined.
2. It is not based on all the measures.
3. It is the least reliable or stable of
all the averages.
4. It does not lend itself to algebraic
treatment.
5. Often no well-defined mode is present
in a distribution.
6. A correct total of the measures can-
not be obtained by multiplying the
mode by the number of cases.
7. It may be determined by comparatively
few items.
8. It can only be found by a special ar-
rangement of the cases in a distribu-
tion table.
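
The behavior of the mode can be sketched in a few lines. The data are hypothetical, and `distribution_table` is an illustrative helper; `statistics.multimode` (Python 3.8 or later) returns every value tied for the highest frequency, which makes the absence of a well-defined mode visible.

```python
from collections import Counter
from statistics import multimode


def distribution_table(measures):
    """Return (value, frequency) pairs in ascending order of value."""
    return sorted(Counter(measures).items())


clear = [4, 4, 4, 5, 6]                 # one well-defined mode
clear_with_extreme = clear + [100]      # an extreme measure is added
ambiguous = [3, 4, 4, 5, 5, 6]          # two values tie for the mode

table = distribution_table(clear)
modes_clear = multimode(clear)
modes_extreme = multimode(clear_with_extreme)
modes_ambiguous = multimode(ambiguous)

# The mode times the number of cases is not a correct total.
total_from_mode = modes_clear[0] * len(clear)
true_total = sum(clear)

print(table)
print(modes_clear, modes_extreme, modes_ambiguous)
print(total_from_mode, true_total)
```

The extreme measure leaves the mode at 4, while the tied series shows why the mode is said not to be rigidly defined.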


H. Advantages of the Geometric Mean.
1. It is rigidly defined mathematically. 
2. It is based on all the measures. 
3. It is fairly stable or reliable. 
4. It is the only average which may be 
used legitimately in some cases. 


I. Disadvantages of the Geometric Mean.
1. It is not readily understood. 
2. It is hard to calculate. 
3. It is useful only in special cases. 


J. Advantages of the Harmonic Mean. 
1. It is rigidly defined mathematically. 





March, 1935 Edward A. 
2. It is based on all the measures.

3. It is fairly stable or reliable.

4. It is the only average which may be
used legitimately in some cases.


K. Disadvantages of the Harmonic Mean.

1. It is not readily understood. 

2. It is hard to calculate. 

3. It is useful only in special cases. 
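
Since the geometric and harmonic means are described as rigidly defined but useful only in special cases, a small sketch of their definitions may help. The two special cases chosen here (averaging ratios, and averaging speeds over equal distances) are common textbook illustrations and are not taken from the lists above; the `*_by_definition` helpers are illustrative names, checked against `statistics.geometric_mean` and `statistics.harmonic_mean` (Python 3.8 or later).

```python
import math
from statistics import geometric_mean, harmonic_mean


def geometric_mean_by_definition(values):
    """n-th root of the product, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean_by_definition(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    return len(values) / sum(1.0 / v for v in values)


ratios = [2.0, 8.0]     # hypothetical ratios of change
speeds = [30.0, 60.0]   # hypothetical speeds over two equal distances

gm = geometric_mean_by_definition(ratios)
hm = harmonic_mean_by_definition(speeds)

print(gm, geometric_mean(ratios))
print(hm, harmonic_mean(speeds))
```

For these values the geometric mean is 4 and the harmonic mean is 40, whereas the arithmetic means would be 5 and 45.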


L. Advantages of the Midscore. 

1. It is based on all the measures.
2. It is readily understood.
3. It is easy to calculate for a small
group, like a single class in school.
4. It does not give excessive weight to
extreme measures.
5. It can be used even if the exact mag-
nitude of the extreme cases is not
known.
M. Disadvantages of the Midscore.
1. It is not rigidly defined.
2. It is not as reliable or stable as
the arithmetic mean.
3. It can be found only by a special ar-
rangement of the measures.
4. It does not give weight to the extreme
measures.
5. It is not susceptible to algebraic
treatment.
6. A correct total cannot be obtained by
multiplying the midscore by the num-
ber of cases.
7. It cannot be found when only the num-
ber of cases and their total magni-
tude is known.
Educators are becoming more selective
in their choice of texts and supplementary
books. Price and authorship are not the
only factors entering into choosing a book
for a given grade level. Such items as
method of presentation, organization, il-
lustrations, indexing, format, and vocabu-
lary are now being given careful considera-
tion. This note is concerned only with the
measurement of the vocabulary content of a 
given book. The results of vocabulary 
studies should be considered as but a part 
of the information needed in selecting a 
text. Consequently, undue emphasis should 
not be given to measures of (1) vocabulary 
difficulty, (2) vocabulary diversity, and 
(3) vocabulary interest. 

The basis of the technique for deter- 
mining measures of difficulty and diversity 
is a 20" x 28" sheet. One side presents 
the 500 most important words in the English 
language in an alphabetical arrangement with
spaces for writing in other words. The 
words in a sample of 1000 running words 
from a textbook are checked or recorded on 
this form. From this record, one can easi- 
ly obtain for each alphabetical group of 
words (1) the number checked in the 500 
word list, (2) the number of words written
in, and (3) the total number of words. From 
these data, measures of vocabulary diffi- 
culty and vocabulary diversity are obtained. 
These are translated into grade placements. 
Directions for these steps are given on the 
other side of the tabulation sheet. 
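
The tallying step described above can be sketched as follows. This is an illustration only: the three-word list stands in for the 500-word list, which is not reproduced here, the function name `tally_sample` is hypothetical, and it is assumed that each different word is entered once on the form. The conversion of these counts into difficulty and diversity grade placements is not given in this note and is not attempted.

```python
import re
from collections import defaultdict


def tally_sample(text, common_words, sample_size=1000):
    """Tally the different words of a sample by initial letter.

    For each alphabetical group the tally holds the number of words found
    in the common-word list ("checked"), the number not found in it
    ("written_in"), and the total.
    """
    running_words = re.findall(r"[a-z]+", text.lower())[:sample_size]
    tally = defaultdict(lambda: {"checked": 0, "written_in": 0, "total": 0})
    for word in sorted(set(running_words)):
        group = tally[word[0]]
        group["checked" if word in common_words else "written_in"] += 1
        group["total"] += 1
    return dict(tally)


# Hypothetical stand-in for the 500-word list.
COMMON_WORDS = {"the", "and", "a"}

result = tally_sample("The cat and the dog saw a zebra.", COMMON_WORDS)
for letter in sorted(result):
    print(letter, result[letter])
```

In use, the sample would be 1000 running words from the book under study and the word list would be the full list printed on the tabulation sheet.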

Vocabulary difficulty is a measure of 
the technical or special meaning words used 
by the author. Books rating high in vocab- 
ulary difficulty will contain many words of 
low frequency that are frequently derived 
from Greek and Latin sources. The difficul-
ty grade placement secured indicates the de-
gree of reading comprehension needed as
measured by a standardized reading test. On
the basis of five years of experience the
norms for vocabulary difficulty were revised
in February, 1935. The grade placements 
have a reliability of .93. 

Vocabulary diversity is a measure of 
the variety or range of words used without 
respect to their difficulty. It indicates the 
verbosity or wordiness of an author. The vo-
cabulary diversity grade placement is sec-
ondary to the vocabulary difficulty grade
placement. Its chief use is in comparing two 
or more books that have approximately the 
same vocabulary difficulty grade placement. 
It is generally true that popular stories 
are low both in vocabulary difficulty and 
diversity. High-grade literature tends to be 
low in difficulty but high in diversity. Sci- 
entific books generally are high in both dif- 
ficulty and diversity. The grade placement 
of popular literature seems to be between 
fifth and sixth grade difficulty. 

Vocabulary interest is a measure of pic- 
ture or image bearing words used. Books that 
use few colorful, sensory words are apt to 
lack interest. Books that children read with 
great delight contain a relatively high per- 
centage of image bearing words. The measure 
of vocabulary interest is now undergoing ex- 
pansion which will serve to increase the re- 
liability through securing a better sampling. 

The technique is useful in selecting 
books within the reading comprehension lev- 
el of students. Particularly is this true 
with texts for use with dull over-age pu- 
pils who have mature reading interests and 
a low comprehension level. In high school 
it is possible to recommend texts in the 
same fields where the students are taught 
according to their mental ability. The 
technique is one more aid to administrators 
in adapting the school to the child. 

















